[jira] [Created] (SPARK-25286) Remove dangerous parmap

2018-08-30 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25286: -- Summary: Remove dangerous parmap Key: SPARK-25286 URL: https://issues.apache.org/jira/browse/SPARK-25286 Project: Spark Issue Type: Improvement Compone

[jira] [Created] (SPARK-25283) A deadlock in UnionRDD

2018-08-30 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25283: -- Summary: A deadlock in UnionRDD Key: SPARK-25283 URL: https://issues.apache.org/jira/browse/SPARK-25283 Project: Spark Issue Type: Bug Components: Spar

[jira] [Updated] (SPARK-25273) How to install testthat v1.0.2

2018-08-29 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-25273: --- Summary: How to install testthat v1.0.2 (was: How to install testthat = 1.0.2) > How to install te

[jira] [Created] (SPARK-25273) How to install testthat = 1.0.2

2018-08-29 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25273: -- Summary: How to install testthat = 1.0.2 Key: SPARK-25273 URL: https://issues.apache.org/jira/browse/SPARK-25273 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-25252) Support arrays of any types in to_json

2018-08-27 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25252: -- Summary: Support arrays of any types in to_json Key: SPARK-25252 URL: https://issues.apache.org/jira/browse/SPARK-25252 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-25227) Extend functionality of to_json to support arrays of differently-typed elements

2018-08-27 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593368#comment-16593368 ] Maxim Gekk commented on SPARK-25227: > I don't know about to_json. Maybe Maxim Gekk

[jira] [Created] (SPARK-25243) Use FailureSafeParser in from_json

2018-08-26 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25243: -- Summary: Use FailureSafeParser in from_json Key: SPARK-25243 URL: https://issues.apache.org/jira/browse/SPARK-25243 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-25199) InferSchema "all Strings" if one of many CSVs is empty

2018-08-25 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-25199. Resolution: Cannot Reproduce > InferSchema "all Strings" if one of many CSVs is empty > --

[jira] [Commented] (SPARK-25199) InferSchema "all Strings" if one of many CSVs is empty

2018-08-25 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592704#comment-16592704 ] Maxim Gekk commented on SPARK-25199: I wasn't able to reproduce the issue on the cur

[jira] [Updated] (SPARK-25240) A deadlock in ALTER TABLE RECOVER PARTITIONS

2018-08-25 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-25240: --- Summary: A deadlock in ALTER TABLE RECOVER PARTITIONS (was: Dead-lock in ALTER TABLE RECOVER PARTIT

[jira] [Created] (SPARK-25240) Dead-lock in ALTER TABLE RECOVER PARTITIONS

2018-08-25 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25240: -- Summary: Dead-lock in ALTER TABLE RECOVER PARTITIONS Key: SPARK-25240 URL: https://issues.apache.org/jira/browse/SPARK-25240 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-25195) Extending from_json function

2018-08-23 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590140#comment-16590140 ] Maxim Gekk commented on SPARK-25195: This is the ticket which combines both from_jso

[jira] [Commented] (SPARK-25195) Extending from_json function

2018-08-23 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589850#comment-16589850 ] Maxim Gekk commented on SPARK-25195: > 1. Does this patch also solve problem 2, as d

[jira] [Commented] (SPARK-25195) Extending from_json function

2018-08-22 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589335#comment-16589335 ] Maxim Gekk commented on SPARK-25195: > Problem number 1: The from_json function acce

[jira] [Commented] (SPARK-17916) CSV data source treats empty string as null no matter what nullValue option is

2018-08-17 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584357#comment-16584357 ] Maxim Gekk commented on SPARK-17916: > The default behavior in 2.3.x for csv format i

[jira] [Updated] (SPARK-25048) Pivoting by multiple columns in Scala/Java

2018-08-07 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-25048: --- Summary: Pivoting by multiple columns in Scala/Java (was: Pivoting by multiple columns) > Pivoting

[jira] [Created] (SPARK-25048) Pivoting by multiple columns

2018-08-07 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25048: -- Summary: Pivoting by multiple columns Key: SPARK-25048 URL: https://issues.apache.org/jira/browse/SPARK-25048 Project: Spark Issue Type: Improvement Co

[jira] [Updated] (SPARK-24945) Switch to uniVocity >= 2.7.2

2018-08-02 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24945: --- Summary: Switch to uniVocity >= 2.7.2 (was: Switch to uniVocity 2.7.2) > Switch to uniVocity >= 2.7

[jira] [Updated] (SPARK-24945) Switch to uniVocity >= 2.7.2

2018-08-02 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24945: --- Description: The recent version 2.7.2 of uniVocity parser includes the fix: https://github.com/uniVo

[jira] [Commented] (SPARK-24777) Refactor AVRO read/write benchmark

2018-07-28 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560896#comment-16560896 ] Maxim Gekk commented on SPARK-24777: [~Gengliang.Wang] Which benchmarks are you goin

[jira] [Created] (SPARK-24959) Do not invoke the CSV/JSON parser for empty schema

2018-07-28 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24959: -- Summary: Do not invoke the CSV/JSON parser for empty schema Key: SPARK-24959 URL: https://issues.apache.org/jira/browse/SPARK-24959 Project: Spark Issue Type: Im

[jira] [Created] (SPARK-24952) Support LZMA2 compression by Avro datasource

2018-07-27 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24952: -- Summary: Support LZMA2 compression by Avro datasource Key: SPARK-24952 URL: https://issues.apache.org/jira/browse/SPARK-24952 Project: Spark Issue Type: Improvem

[jira] [Updated] (SPARK-24945) Switch to uniVocity 2.7.2

2018-07-27 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24945: --- Summary: Switch to uniVocity 2.7.2 (was: Switch to unoVocity 2.7.2) > Switch to uniVocity 2.7.2 > -

[jira] [Created] (SPARK-24945) Switch to unoVocity 2.7.2

2018-07-27 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24945: -- Summary: Switch to unoVocity 2.7.2 Key: SPARK-24945 URL: https://issues.apache.org/jira/browse/SPARK-24945 Project: Spark Issue Type: Improvement Compo

[jira] [Created] (SPARK-24911) SHOW CREATE TABLE drops escaping of nested column names

2018-07-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24911: -- Summary: SHOW CREATE TABLE drops escaping of nested column names Key: SPARK-24911 URL: https://issues.apache.org/jira/browse/SPARK-24911 Project: Spark Issue Typ

[jira] [Created] (SPARK-24881) New options - compression and compressionLevel

2018-07-22 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24881: -- Summary: New options - compression and compressionLevel Key: SPARK-24881 URL: https://issues.apache.org/jira/browse/SPARK-24881 Project: Spark Issue Type: Sub-ta

[jira] [Commented] (SPARK-24849) Convert StructType to DDL string

2018-07-18 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548878#comment-16548878 ] Maxim Gekk commented on SPARK-24849: [~maropu] This is a part of my work on customer

[jira] [Created] (SPARK-24854) Gather all options into AvroOptions

2018-07-18 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24854: -- Summary: Gather all options into AvroOptions Key: SPARK-24854 URL: https://issues.apache.org/jira/browse/SPARK-24854 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-24849) Convert StructType to DDL string

2018-07-18 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547876#comment-16547876 ] Maxim Gekk commented on SPARK-24849: I am working on the ticket. > Convert StructTy

[jira] [Created] (SPARK-24849) Convert StructType to DDL string

2018-07-18 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24849: -- Summary: Convert StructType to DDL string Key: SPARK-24849 URL: https://issues.apache.org/jira/browse/SPARK-24849 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-24836) New option - ignoreExtension

2018-07-17 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24836: -- Summary: New option - ignoreExtension Key: SPARK-24836 URL: https://issues.apache.org/jira/browse/SPARK-24836 Project: Spark Issue Type: Sub-task Compo

[jira] [Updated] (SPARK-24810) Fix paths to resource files in AvroSuite

2018-07-15 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24810: --- Attachment: Screen Shot 2018-07-15 at 15.28.13.png > Fix paths to resource files in AvroSuite >

[jira] [Created] (SPARK-24810) Fix paths to resource files in AvroSuite

2018-07-15 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24810: -- Summary: Fix paths to resource files in AvroSuite Key: SPARK-24810 URL: https://issues.apache.org/jira/browse/SPARK-24810 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-24807) Adding files/jars twice: output a warning and add a note

2018-07-14 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24807: -- Summary: Adding files/jars twice: output a warning and add a note Key: SPARK-24807 URL: https://issues.apache.org/jira/browse/SPARK-24807 Project: Spark Issue Ty

[jira] [Created] (SPARK-24805) Don't ignore files without .avro extension by default

2018-07-14 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24805: -- Summary: Don't ignore files without .avro extension by default Key: SPARK-24805 URL: https://issues.apache.org/jira/browse/SPARK-24805 Project: Spark Issue Type:

[jira] [Created] (SPARK-24761) Check modifiability of config parameters

2018-07-08 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24761: -- Summary: Check modifiability of config parameters Key: SPARK-24761 URL: https://issues.apache.org/jira/browse/SPARK-24761 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-24757) Improve error message for broadcast timeouts

2018-07-07 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24757: -- Summary: Improve error message for broadcast timeouts Key: SPARK-24757 URL: https://issues.apache.org/jira/browse/SPARK-24757 Project: Spark Issue Type: Improvem

[jira] [Commented] (SPARK-24164) Support column list as the pivot column in Pivot

2018-07-02 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530448#comment-16530448 ] Maxim Gekk commented on SPARK-24164: [~maryannxue] Are you working on the feature, o

[jira] [Created] (SPARK-24722) Column-based API for pivoting

2018-07-02 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24722: -- Summary: Column-based API for pivoting Key: SPARK-24722 URL: https://issues.apache.org/jira/browse/SPARK-24722 Project: Spark Issue Type: Improvement C

[jira] [Comment Edited] (SPARK-24642) Add a function which infers schema from a JSON column

2018-07-01 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529030#comment-16529030 ] Maxim Gekk edited comment on SPARK-24642 at 7/1/18 10:05 AM: -

[jira] [Commented] (SPARK-24642) Add a function which infers schema from a JSON column

2018-07-01 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529030#comment-16529030 ] Maxim Gekk commented on SPARK-24642: I created a new ticket, SPARK-24709, which aims to

[jira] [Resolved] (SPARK-24642) Add a function which infers schema from a JSON column

2018-07-01 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-24642. Resolution: Won't Fix > Add a function which infers schema from a JSON column > --

[jira] [Created] (SPARK-24709) Inferring schema from JSON string literal

2018-07-01 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24709: -- Summary: Inferring schema from JSON string literal Key: SPARK-24709 URL: https://issues.apache.org/jira/browse/SPARK-24709 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-23725) Improve Hadoop's LineReader to support charsets different from UTF-8

2018-06-30 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528691#comment-16528691 ] Maxim Gekk commented on SPARK-23725: [~hyukjin.kwon] I am working on the implementat

[jira] [Resolved] (SPARK-24643) from_json should accept an aggregate function as schema

2018-06-29 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-24643. Resolution: Won't Fix > from_json should accept an aggregate function as schema >

[jira] [Commented] (SPARK-24642) Add a function which infers schema from a JSON column

2018-06-28 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526547#comment-16526547 ] Maxim Gekk commented on SPARK-24642: > I think this is too complicated and unpredict

[jira] [Commented] (SPARK-24642) Add a function which infers schema from a JSON column

2018-06-27 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524802#comment-16524802 ] Maxim Gekk commented on SPARK-24642: > Do we want this as an aggregate function? I

[jira] [Commented] (SPARK-9775) Query Mesos for number of CPUs to set default parallelism

2018-06-25 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522064#comment-16522064 ] Maxim Gekk commented on SPARK-9775: --- Please change the other related methods like propos

[jira] [Resolved] (SPARK-24445) Schema in json format for from_json in SQL

2018-06-24 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-24445. Resolution: Won't Fix > Schema in json format for from_json in SQL > -

[jira] [Created] (SPARK-24643) from_json should accept an aggregate function as schema

2018-06-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24643: -- Summary: from_json should accept an aggregate function as schema Key: SPARK-24643 URL: https://issues.apache.org/jira/browse/SPARK-24643 Project: Spark Issue Typ

[jira] [Created] (SPARK-24642) Add a function which infers schema from a JSON column

2018-06-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24642: -- Summary: Add a function which infers schema from a JSON column Key: SPARK-24642 URL: https://issues.apache.org/jira/browse/SPARK-24642 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24605) size(null) should return null

2018-06-20 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518039#comment-16518039 ] Maxim Gekk commented on SPARK-24605: I am working on a PR which introduces new behav

[jira] [Created] (SPARK-24605) size(null) should return null

2018-06-20 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24605: -- Summary: size(null) should return null Key: SPARK-24605 URL: https://issues.apache.org/jira/browse/SPARK-24605 Project: Spark Issue Type: Improvement C

[jira] [Created] (SPARK-24591) Number of cores and executors in the cluster

2018-06-18 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24591: -- Summary: Number of cores and executors in the cluster Key: SPARK-24591 URL: https://issues.apache.org/jira/browse/SPARK-24591 Project: Spark Issue Type: Improvem

[jira] [Updated] (SPARK-24571) Support literals with values of the Char type

2018-06-15 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24571: --- Description: Currently, Spark doesn't support literals with the Char (java.lang.Character) type. Fo

[jira] [Created] (SPARK-24571) Support literals with values of the Char type

2018-06-15 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24571: -- Summary: Support literals with values of the Char type Key: SPARK-24571 URL: https://issues.apache.org/jira/browse/SPARK-24571 Project: Spark Issue Type: Improve

[jira] [Commented] (SPARK-24571) Support literals with values of the Char type

2018-06-15 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514665#comment-16514665 ] Maxim Gekk commented on SPARK-24571: I am working on the improvement. > Support lit

[jira] [Created] (SPARK-24543) Support any DataType as DDL string for from_json's schema

2018-06-13 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24543: -- Summary: Support any DataType as DDL string for from_json's schema Key: SPARK-24543 URL: https://issues.apache.org/jira/browse/SPARK-24543 Project: Spark Issue T

[jira] [Commented] (SPARK-24543) Support any DataType as DDL string for from_json's schema

2018-06-13 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510753#comment-16510753 ] Maxim Gekk commented on SPARK-24543: I am working on the feature at the moment. > S

[jira] [Commented] (SPARK-24005) Remove usage of Scala’s parallel collection

2018-06-12 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509667#comment-16509667 ] Maxim Gekk commented on SPARK-24005: [~smilegator] I am trying to reproduce the issu

[jira] [Commented] (SPARK-24445) Schema in json format for from_json in SQL

2018-05-31 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497075#comment-16497075 ] Maxim Gekk commented on SPARK-24445: I am working on the ticket at the moment. > Sc

[jira] [Created] (SPARK-24445) Schema in json format for from_json in SQL

2018-05-31 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24445: -- Summary: Schema in json format for from_json in SQL Key: SPARK-24445 URL: https://issues.apache.org/jira/browse/SPARK-24445 Project: Spark Issue Type: Improvemen

[jira] [Resolved] (SPARK-14034) Converting to Dataset causes wrong order and values in nested array of documents

2018-05-27 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-14034. Resolution: Fixed Fix Version/s: 2.3.0 > Converting to Dataset causes wrong order and values

[jira] [Commented] (SPARK-14034) Converting to Dataset causes wrong order and values in nested array of documents

2018-05-27 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492078#comment-16492078 ] Maxim Gekk commented on SPARK-14034: I checked on Spark 2.3: {code:scala} case class

[jira] [Resolved] (SPARK-24004) Tests of from_json for MapType

2018-05-25 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-24004. Resolution: Won't Fix > Tests of from_json for MapType > -- > >

[jira] [Resolved] (SPARK-15125) CSV data source recognizes empty quoted strings in the input as null.

2018-05-25 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-15125. Resolution: Fixed Fix Version/s: 2.4.0 The issue has been fixed by https://github.com/apach

[jira] [Reopened] (SPARK-24244) Parse only required columns of CSV file

2018-05-23 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk reopened SPARK-24244: The previous PR was reverted due to the flaky UnivocityParserSuite > Parse only required columns of CSV file > --

[jira] [Updated] (SPARK-24366) Improve error message for Catalyst type converters

2018-05-23 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24366: --- Summary: Improve error message for Catalyst type converters (was: Improve error message for type con

[jira] [Updated] (SPARK-24366) Improve error message for type converting

2018-05-23 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24366: --- Summary: Improve error message for type converting (was: Improve error message for type conversions)

[jira] [Created] (SPARK-24366) Improve error message for type conversions

2018-05-23 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24366: -- Summary: Improve error message for type conversions Key: SPARK-24366 URL: https://issues.apache.org/jira/browse/SPARK-24366 Project: Spark Issue Type: Improvemen

[jira] [Created] (SPARK-24329) Remove comments filtering before parsing of CSV files

2018-05-21 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24329: -- Summary: Remove comments filtering before parsing of CSV files Key: SPARK-24329 URL: https://issues.apache.org/jira/browse/SPARK-24329 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24325) Tests for Hadoop's LinesReader

2018-05-20 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-24325: --- Description: Currently, there are no tests for [Hadoop LinesReader|https://github.com/apache/spark/b

[jira] [Created] (SPARK-24325) Tests for Hadoop's LinesReader

2018-05-20 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24325: -- Summary: Tests for Hadoop's LinesReader Key: SPARK-24325 URL: https://issues.apache.org/jira/browse/SPARK-24325 Project: Spark Issue Type: Sub-task Com

[jira] [Created] (SPARK-24276) semanticHash() returns different values for semantically the same IS IN

2018-05-14 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24276: -- Summary: semanticHash() returns different values for semantically the same IS IN Key: SPARK-24276 URL: https://issues.apache.org/jira/browse/SPARK-24276 Project: Spark

[jira] [Created] (SPARK-24269) Infer nullability rather than declaring all columns as nullable

2018-05-14 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24269: -- Summary: Infer nullability rather than declaring all columns as nullable Key: SPARK-24269 URL: https://issues.apache.org/jira/browse/SPARK-24269 Project: Spark

[jira] [Created] (SPARK-24244) Parse only required columns of CSV file

2018-05-10 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24244: -- Summary: Parse only required columns of CSV file Key: SPARK-24244 URL: https://issues.apache.org/jira/browse/SPARK-24244 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-24190) lineSep shouldn't be required in JSON write

2018-05-05 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24190: -- Summary: lineSep shouldn't be required in JSON write Key: SPARK-24190 URL: https://issues.apache.org/jira/browse/SPARK-24190 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-24171) Update comments for non-deterministic functions

2018-05-03 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24171: -- Summary: Update comments for non-deterministic functions Key: SPARK-24171 URL: https://issues.apache.org/jira/browse/SPARK-24171 Project: Spark Issue Type: Docum

[jira] [Created] (SPARK-24118) Support lineSep format independent from encoding

2018-04-29 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24118: -- Summary: Support lineSep format independent from encoding Key: SPARK-24118 URL: https://issues.apache.org/jira/browse/SPARK-24118 Project: Spark Issue Type: Sub-

[jira] [Commented] (SPARK-24068) CSV schema inferring doesn't work for compressed files

2018-04-26 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454761#comment-16454761 ] Maxim Gekk commented on SPARK-24068: The same issue exists in JSON datasource. [~hyuk

[jira] [Created] (SPARK-24068) CSV schema inferring doesn't work for compressed files

2018-04-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24068: -- Summary: CSV schema inferring doesn't work for compressed files Key: SPARK-24068 URL: https://issues.apache.org/jira/browse/SPARK-24068 Project: Spark Issue Type

[jira] [Created] (SPARK-24027) Support MapType(StringType, DataType) as root type by from_json

2018-04-19 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24027: -- Summary: Support MapType(StringType, DataType) as root type by from_json Key: SPARK-24027 URL: https://issues.apache.org/jira/browse/SPARK-24027 Project: Spark

[jira] [Created] (SPARK-24004) Tests of from_json for MapType

2018-04-17 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24004: -- Summary: Tests of from_json for MapType Key: SPARK-24004 URL: https://issues.apache.org/jira/browse/SPARK-24004 Project: Spark Issue Type: Test Compone

[jira] [Created] (SPARK-23849) Tests for the samplingRatio option of json schema inferring

2018-04-02 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23849: -- Summary: Tests for the samplingRatio option of json schema inferring Key: SPARK-23849 URL: https://issues.apache.org/jira/browse/SPARK-23849 Project: Spark Issu

[jira] [Created] (SPARK-23846) samplingRatio for schema inferring of CSV datasource

2018-04-02 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23846: -- Summary: samplingRatio for schema inferring of CSV datasource Key: SPARK-23846 URL: https://issues.apache.org/jira/browse/SPARK-23846 Project: Spark Issue Type:

[jira] [Created] (SPARK-23786) CSV schema validation - column names are not checked

2018-03-23 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23786: -- Summary: CSV schema validation - column names are not checked Key: SPARK-23786 URL: https://issues.apache.org/jira/browse/SPARK-23786 Project: Spark Issue Type:

[jira] [Created] (SPARK-23741) UTF8String must not return chars disallowed in UTF-8

2018-03-19 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23741: -- Summary: UTF8String must not return chars disallowed in UTF-8 Key: SPARK-23741 URL: https://issues.apache.org/jira/browse/SPARK-23741 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23724) Custom record separator for jsons in charsets different from UTF-8

2018-03-19 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16404815#comment-16404815 ] Maxim Gekk commented on SPARK-23724: For this issue I plan to propose this PR: https

[jira] [Created] (SPARK-23725) Improve Hadoop's LineReader to support charsets different from UTF-8

2018-03-17 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23725: -- Summary: Improve Hadoop's LineReader to support charsets different from UTF-8 Key: SPARK-23725 URL: https://issues.apache.org/jira/browse/SPARK-23725 Project: Spark

[jira] [Created] (SPARK-23724) Custom record separator for jsons in charsets different from UTF-8

2018-03-17 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23724: -- Summary: Custom record separator for jsons in charsets different from UTF-8 Key: SPARK-23724 URL: https://issues.apache.org/jira/browse/SPARK-23724 Project: Spark

[jira] [Created] (SPARK-23723) New charset option for json datasource

2018-03-17 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23723: -- Summary: New charset option for json datasource Key: SPARK-23723 URL: https://issues.apache.org/jira/browse/SPARK-23723 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-23649) CSV schema inferring fails on some UTF-8 chars

2018-03-11 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-23649: --- Shepherd: Herman van Hovell > CSV schema inferring fails on some UTF-8 chars > --

[jira] [Updated] (SPARK-23649) CSV schema inferring fails on some UTF-8 chars

2018-03-11 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-23649: --- Attachment: utf8xFF.csv > CSV schema inferring fails on some UTF-8 chars > --

[jira] [Created] (SPARK-23649) CSV schema inferring fails on some UTF-8 chars

2018-03-11 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23649: -- Summary: CSV schema inferring fails on some UTF-8 chars Key: SPARK-23649 URL: https://issues.apache.org/jira/browse/SPARK-23649 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-23643) XORShiftRandom.hashSeed allocates unnecessary memory

2018-03-10 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-23643: --- Description: The hashSeed method allocates a 64-byte buffer and puts only 8 bytes of the seed paramete

[jira] [Updated] (SPARK-23643) XORShiftRandom.hashSeed allocates unnecessary memory

2018-03-10 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-23643: --- Summary: XORShiftRandom.hashSeed allocates unnecessary memory (was: XORShiftRandom.setSeed allocates

[jira] [Created] (SPARK-23643) XORShiftRandom.setSeed allocates unnecessary memory

2018-03-10 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23643: -- Summary: XORShiftRandom.setSeed allocates unnecessary memory Key: SPARK-23643 URL: https://issues.apache.org/jira/browse/SPARK-23643 Project: Spark Issue Type: B

[jira] [Created] (SPARK-23620) Split thread dump lines by using the br tag

2018-03-07 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23620: -- Summary: Split thread dump lines by using the br tag Key: SPARK-23620 URL: https://issues.apache.org/jira/browse/SPARK-23620 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-23410) Unable to read jsons in charset different from UTF-8

2018-02-15 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366547#comment-16366547 ] Maxim Gekk commented on SPARK-23410: [~sameerag] It is not a blocker anymore. I unset t

[jira] [Updated] (SPARK-23410) Unable to read jsons in charset different from UTF-8

2018-02-15 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-23410: --- Priority: Major (was: Blocker) > Unable to read jsons in charset different from UTF-8 >
