[jira] [Created] (SPARK-20038) FileFormatWriter.ExecuteWriteTask.releaseResources() implementations to be re-entrant

2017-03-20 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-20038: -- Summary: FileFormatWriter.ExecuteWriteTask.releaseResources() implementations to be re-entrant Key: SPARK-20038 URL: https://issues.apache.org/jira/browse/SPARK-20038

[jira] [Updated] (SPARK-19570) Allow to disable hive in pyspark shell

2017-03-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-19570: Component/s: SQL > Allow to disable hive in pyspark shell > -- > >

[jira] [Updated] (SPARK-19962) add DictVectorizor for DataFrame

2017-03-20 Thread yu peng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yu peng updated SPARK-19962: Issue Type: New Feature (was: Wish) > add DictVectorizor for DataFrame >

[jira] [Updated] (SPARK-20028) Implement NGrams aggregate function

2017-03-20 Thread Chenzhao Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chenzhao Guo updated SPARK-20028: - Description: N-grams are subsequences of length N drawn from a longer sequence. The purpose of

[jira] [Assigned] (SPARK-20028) Implement NGrams aggregate function

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20028: Assignee: Apache Spark > Implement NGrams aggregate function >

[jira] [Assigned] (SPARK-20028) Implement NGrams aggregate function

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20028: Assignee: (was: Apache Spark) > Implement NGrams aggregate function >

[jira] [Resolved] (SPARK-17791) Join reordering using star schema detection

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17791. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 15363

[jira] [Updated] (SPARK-20027) Compilation fixed in java docs.

2017-03-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20027: -- Priority: Trivial (was: Major) Doesn't need a JIRA > Compilation fixed in java docs. >

[jira] [Created] (SPARK-20028) Implement NGrams aggregate function

2017-03-20 Thread Chenzhao Guo (JIRA)
Chenzhao Guo created SPARK-20028: Summary: Implement NGrams aggregate function Key: SPARK-20028 URL: https://issues.apache.org/jira/browse/SPARK-20028 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-20022) java.lang.OutOfMemoryError: Unable to acquire 4228 bytes of memory

2017-03-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20022. --- Resolution: Not A Problem This just means you ran out of memory. The mailing list is more

[jira] [Resolved] (SPARK-20020) SparkR should support checkpointing DataFrame

2017-03-20 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-20020. -- Resolution: Fixed Assignee: Felix Cheung Fix Version/s: 2.2.0

[jira] [Commented] (SPARK-20016) SparkLauncher submit job failed after setConf with special charaters under windows

2017-03-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932295#comment-15932295 ] Sean Owen commented on SPARK-20016: --- It's not clear that the error has anything to do with this. It

[jira] [Assigned] (SPARK-19994) Wrong outputOrdering for right/full outer smj

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19994: --- Assignee: Zhenhua Wang > Wrong outputOrdering for right/full outer smj >

[jira] [Resolved] (SPARK-19994) Wrong outputOrdering for right/full outer smj

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19994. - Resolution: Fixed Fix Version/s: 2.2.0 2.0.3 2.1.1

[jira] [Assigned] (SPARK-20038) FileFormatWriter.ExecuteWriteTask.releaseResources() implementations to be re-entrant

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20038: Assignee: (was: Apache Spark) > FileFormatWriter.ExecuteWriteTask.releaseResources()

[jira] [Assigned] (SPARK-20038) FileFormatWriter.ExecuteWriteTask.releaseResources() implementations to be re-entrant

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20038: Assignee: Apache Spark > FileFormatWriter.ExecuteWriteTask.releaseResources()

[jira] [Commented] (SPARK-20038) FileFormatWriter.ExecuteWriteTask.releaseResources() implementations to be re-entrant

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933430#comment-15933430 ] Apache Spark commented on SPARK-20038: -- User 'steveloughran' has created a pull request for this

[jira] [Commented] (SPARK-19970) Table owner should be USER instead of PRINCIPAL in kerberized clusters

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933613#comment-15933613 ] Apache Spark commented on SPARK-19970: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-19962) add DictVectorizor for DataFrame

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19962: Assignee: Apache Spark > add DictVectorizor for DataFrame >

[jira] [Assigned] (SPARK-19962) add DictVectorizor for DataFrame

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19962: Assignee: (was: Apache Spark) > add DictVectorizor for DataFrame >

[jira] [Commented] (SPARK-19962) add DictVectorizor for DataFrame

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933534#comment-15933534 ] Apache Spark commented on SPARK-19962: -- User 'yupbank' has created a pull request for this issue:

[jira] [Resolved] (SPARK-19912) String literals are not escaped while performing Hive metastore level partition pruning

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19912. - Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 2.2.0

[jira] [Updated] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-03-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19636: -- Shepherd: Joseph K. Bradley > Feature parity for correlation statistics in MLlib >

[jira] [Resolved] (SPARK-19980) Basic Dataset transformation on POJOs does not preserves nulls.

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19980. - Resolution: Fixed Fix Version/s: 2.2.0 > Basic Dataset transformation on POJOs does not

[jira] [Resolved] (SPARK-19949) unify bad record handling in CSV and JSON

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-19949. - Resolution: Fixed Fix Version/s: 2.2.0 > unify bad record handling in CSV and JSON >

[jira] [Created] (SPARK-20040) Python API for ml.stat.ChiSquareTest

2017-03-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-20040: - Summary: Python API for ml.stat.ChiSquareTest Key: SPARK-20040 URL: https://issues.apache.org/jira/browse/SPARK-20040 Project: Spark Issue Type:

[jira] [Updated] (SPARK-20040) Python API for ml.stat.ChiSquareTest

2017-03-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20040: -- Description: Add PySpark wrapper for ChiSquareTest. Note that it's currently called

[jira] [Resolved] (SPARK-20024) SessionCatalog reset need to set the current database of ExternalCatalog

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-20024. - Resolution: Fixed Fix Version/s: 2.2.0 > SessionCatalog reset need to set the current database of

[jira] [Resolved] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17204. - Resolution: Fixed Assignee: Michael Allman Fix Version/s: 2.2.0

[jira] [Updated] (SPARK-19983) Getting ValidationFailureSemanticException on 'INSERT OVEWRITE'

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19983: Labels: (was: sparkSQL) > Getting ValidationFailureSemanticException on 'INSERT OVEWRITE' >

[jira] [Commented] (SPARK-19983) Getting ValidationFailureSemanticException on 'INSERT OVEWRITE'

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934070#comment-15934070 ] Xiao Li commented on SPARK-19983: - This does not sound like a bug to me. This is by design. The partition

[jira] [Closed] (SPARK-19983) Getting ValidationFailureSemanticException on 'INSERT OVEWRITE'

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-19983. --- Resolution: Not A Problem > Getting ValidationFailureSemanticException on 'INSERT OVEWRITE' >

[jira] [Updated] (SPARK-20017) Functions "str_to_map" and "explode" throws NPE exceptioin

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20017: Target Version/s: 2.1.1, 2.2.0 (was: 2.2.0) > Functions "str_to_map" and "explode" throws NPE exceptioin

[jira] [Updated] (SPARK-20017) Functions "str_to_map" and "explode" throws NPE exceptioin

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20017: Labels: correctness (was: ) > Functions "str_to_map" and "explode" throws NPE exceptioin >

[jira] [Updated] (SPARK-20017) Functions "str_to_map" and "explode" throws NPE exceptioin

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20017: Target Version/s: 2.2.0 > Functions "str_to_map" and "explode" throws NPE exceptioin >

[jira] [Created] (SPARK-20039) Rename ml.stat.ChiSquare to ml.stat.ChiSquareTest

2017-03-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-20039: - Summary: Rename ml.stat.ChiSquare to ml.stat.ChiSquareTest Key: SPARK-20039 URL: https://issues.apache.org/jira/browse/SPARK-20039 Project: Spark

[jira] [Assigned] (SPARK-19968) Use a cached instance of KafkaProducer for writing to kafka via KafkaSink.

2017-03-20 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma reassigned SPARK-19968: --- Assignee: (was: Prashant Sharma) > Use a cached instance of KafkaProducer for

[jira] [Assigned] (SPARK-20039) Rename ml.stat.ChiSquare to ml.stat.ChiSquareTest

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20039: Assignee: Apache Spark (was: Joseph K. Bradley) > Rename ml.stat.ChiSquare to

[jira] [Commented] (SPARK-20039) Rename ml.stat.ChiSquare to ml.stat.ChiSquareTest

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934107#comment-15934107 ] Apache Spark commented on SPARK-20039: -- User 'jkbradley' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20039) Rename ml.stat.ChiSquare to ml.stat.ChiSquareTest

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20039: Assignee: Joseph K. Bradley (was: Apache Spark) > Rename ml.stat.ChiSquare to

[jira] [Updated] (SPARK-20024) SessionCatalog reset need to set the current database of ExternalCatalog

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20024: Summary: SessionCatalog reset need to set the current database of ExternalCatalog (was: SessionCatalog

[jira] [Updated] (SPARK-20024) SessionCatalog API setCurrentDatabase need to set the current database of ExternalCatalog

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20024: Description: SessionCatalog API setCurrentDatabase does not set the current database of the underlying

[jira] [Resolved] (SPARK-19906) Add Documentation for Kafka Write paths

2017-03-20 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19906. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17246

[jira] [Resolved] (SPARK-20010) Sort information is lost after sort merge join

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20010. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17339

[jira] [Assigned] (SPARK-20010) Sort information is lost after sort merge join

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20010: --- Assignee: Zhenhua Wang > Sort information is lost after sort merge join >

[jira] [Assigned] (SPARK-19980) Basic Dataset transformation on POJOs does not preserves nulls.

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19980: --- Assignee: Takeshi Yamamuro > Basic Dataset transformation on POJOs does not preserves

[jira] [Commented] (SPARK-19475) (ML|MLlib).linalg.DenseVector method delegation fails for __neg__

2017-03-20 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933914#comment-15933914 ] holdenk commented on SPARK-19475: - This seems like it might be a reasonable candidate for 2.1.1 - what do

[jira] [Commented] (SPARK-19475) (ML|MLlib).linalg.DenseVector method delegation fails for __neg__

2017-03-20 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933981#comment-15933981 ] Maciej Szymkiewicz commented on SPARK-19475: Sounds good [~holdenk], though I wonder if we

[jira] [Updated] (SPARK-19237) SparkR package on Windows waiting for a long time when no java is found launching spark-submit

2017-03-20 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-19237: - Target Version/s: 2.1.1 > SparkR package on Windows waiting for a long time when no java is

[jira] [Resolved] (SPARK-19573) Make NaN/null handling consistent in approxQuantile

2017-03-20 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-19573. - Resolution: Fixed Assignee: zhengruifeng Fix Version/s: 2.2.0 > Make NaN/null handling

[jira] [Assigned] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-03-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-19636: - Assignee: Timothy Hunter (was: Tim Hunter) > Feature parity for correlation

[jira] [Updated] (SPARK-19983) Getting ValidationFailureSemanticException on 'INSERT OVEWRITE'

2017-03-20 Thread Rajkumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar updated SPARK-19983: - Description: Hi, I am creating a DataFrame and registering that DataFrame as a temp table using

[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS

2017-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933324#comment-15933324 ] Steve Loughran commented on SPARK-10109: I think the cause is actually that in some codepaths, if

[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS

2017-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933340#comment-15933340 ] Steve Loughran commented on SPARK-10109: This is a bit related to the execution/commit mechanism;

[jira] [Commented] (SPARK-16087) Spark Hangs When Using Union With Persisted Hadoop RDD

2017-03-20 Thread Mark Heimann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932527#comment-15932527 ] Mark Heimann commented on SPARK-16087: -- We're most likely seeing the same issue in our project using

[jira] [Created] (SPARK-20033) spark sql can not use hive permanent function

2017-03-20 Thread cen yuhai (JIRA)
cen yuhai created SPARK-20033: - Summary: spark sql can not use hive permanent function Key: SPARK-20033 URL: https://issues.apache.org/jira/browse/SPARK-20033 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-20033) spark sql can not use hive permanent function

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932578#comment-15932578 ] Apache Spark commented on SPARK-20033: -- User 'cenyuhai' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-16087) Spark Hangs When Using Union With Persisted Hadoop RDD

2017-03-20 Thread Mark Heimann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932527#comment-15932527 ] Mark Heimann edited comment on SPARK-16087 at 3/20/17 11:40 AM: We're

[jira] [Updated] (SPARK-20029) LiR supports bound constrained optimization

2017-03-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-20029: Description: MLlib {{LinearRegression}} should support bound constrained optimization. Users can

[jira] [Created] (SPARK-20032) SparkIMain getClassOutputDirectory method is missing.

2017-03-20 Thread Nupur Sahu (JIRA)
Nupur Sahu created SPARK-20032: -- Summary: SparkIMain getClassOutputDirectory method is missing. Key: SPARK-20032 URL: https://issues.apache.org/jira/browse/SPARK-20032 Project: Spark Issue

[jira] [Updated] (SPARK-20032) SparkIMain getClassOutputDirectory method is missing in spark-repl2.11 with 1.4.1 version.

2017-03-20 Thread Nupur Sahu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nupur Sahu updated SPARK-20032: --- Summary: SparkIMain getClassOutputDirectory method is missing in spark-repl2.11 with 1.4.1 version.

[jira] [Assigned] (SPARK-20033) spark sql can not use hive permanent function

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20033: Assignee: (was: Apache Spark) > spark sql can not use hive permanent function >

[jira] [Assigned] (SPARK-20033) spark sql can not use hive permanent function

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20033: Assignee: Apache Spark > spark sql can not use hive permanent function >

[jira] [Updated] (SPARK-19976) DirectStream API throws OffsetOutOfRange Exception

2017-03-20 Thread Taukir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Taukir updated SPARK-19976: --- Description: I am using the following code. When data on the Kafka topic is deleted or its retention period is over, it

[jira] [Commented] (SPARK-20031) sc.wholeTextFiles + toDebugString takes long time even before action is performed

2017-03-20 Thread Rakesh Kumar Dash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932516#comment-15932516 ] Rakesh Kumar Dash commented on SPARK-20031: --- Similar behaviour is observed for

[jira] [Commented] (SPARK-20030) Add Event Time based Timeout

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932496#comment-15932496 ] Apache Spark commented on SPARK-20030: -- User 'tdas' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20030) Add Event Time based Timeout

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20030: Assignee: Tathagata Das (was: Apache Spark) > Add Event Time based Timeout >

[jira] [Assigned] (SPARK-20030) Add Event Time based Timeout

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20030: Assignee: Apache Spark (was: Tathagata Das) > Add Event Time based Timeout >

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-03-20 Thread Danny Ruchman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932238#comment-15932238 ] Danny Ruchman commented on SPARK-18838: --- Hi, we also have an issue where we query a huge amount of

[jira] [Commented] (SPARK-20028) Implement NGrams aggregate function

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932254#comment-15932254 ] Apache Spark commented on SPARK-20028: -- User 'gczsjdy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-15040) PySpark impl for ml.feature.Imputer

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15040: Assignee: (was: Apache Spark) > PySpark impl for ml.feature.Imputer >

[jira] [Assigned] (SPARK-15040) PySpark impl for ml.feature.Imputer

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15040: Assignee: Apache Spark > PySpark impl for ml.feature.Imputer >

[jira] [Assigned] (SPARK-20029) LiR supports bound constrained optimization

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20029: Assignee: (was: Apache Spark) > LiR supports bound constrained optimization >

[jira] [Commented] (SPARK-20029) LiR supports bound constrained optimization

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932428#comment-15932428 ] Apache Spark commented on SPARK-20029: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20029) LiR supports bound constrained optimization

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20029: Assignee: Apache Spark > LiR supports bound constrained optimization >

[jira] [Updated] (SPARK-20029) LiR supports bound constrained optimization

2017-03-20 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-20029: Shepherd: DB Tsai > LiR supports bound constrained optimization >

[jira] [Created] (SPARK-20031) sc.wholeTextFiles + toDebugString takes long time even before action is performed

2017-03-20 Thread Rakesh Kumar Dash (JIRA)
Rakesh Kumar Dash created SPARK-20031: - Summary: sc.wholeTextFiles + toDebugString takes long time even before action is performed Key: SPARK-20031 URL: https://issues.apache.org/jira/browse/SPARK-20031

[jira] [Resolved] (SPARK-19838) Adding Processing Time based timeout

2017-03-20 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19838. --- Resolution: Fixed Fix Version/s: 2.2.0 https://github.com/apache/spark/pull/17179 >

[jira] [Created] (SPARK-20030) Add Event Time based Timeout

2017-03-20 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-20030: - Summary: Add Event Time based Timeout Key: SPARK-20030 URL: https://issues.apache.org/jira/browse/SPARK-20030 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-20029) LiR supports bound constrained optimization

2017-03-20 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-20029: --- Summary: LiR supports bound constrained optimization Key: SPARK-20029 URL: https://issues.apache.org/jira/browse/SPARK-20029 Project: Spark Issue Type:

[jira] [Commented] (SPARK-19970) Table owner should be USER instead of PRINCIPAL in kerberized clusters

2017-03-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933152#comment-15933152 ] Apache Spark commented on SPARK-19970: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-19899) FPGrowth input column naming

2017-03-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-19899: - Assignee: Maciej Szymkiewicz > FPGrowth input column naming >

[jira] [Updated] (SPARK-20035) Spark 2.0.2 writes empty file if no record is in the dataset

2017-03-20 Thread Andrew (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew updated SPARK-20035: --- Description: When there is no record in a dataset, the call to write with spark-csv creates an empty file

[jira] [Resolved] (SPARK-19899) FPGrowth input column naming

2017-03-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-19899. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17321

[jira] [Created] (SPARK-20036) impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0

2017-03-20 Thread Daniel Nuriyev (JIRA)
Daniel Nuriyev created SPARK-20036: -- Summary: impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0 Key: SPARK-20036 URL: https://issues.apache.org/jira/browse/SPARK-20036

[jira] [Updated] (SPARK-20036) impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0

2017-03-20 Thread Daniel Nuriyev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Nuriyev updated SPARK-20036: --- Component/s: (was: Spark Core) Input/Output > impossible to read a

[jira] [Updated] (SPARK-20036) impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0

2017-03-20 Thread Daniel Nuriyev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Nuriyev updated SPARK-20036: --- Description: I use kafka 0.10.1 and java code with the following dependencies:

[jira] [Created] (SPARK-20037) impossible to set kafka offsets using kafka 0.10 and spark 2.0.0

2017-03-20 Thread Daniel Nuriyev (JIRA)
Daniel Nuriyev created SPARK-20037: -- Summary: impossible to set kafka offsets using kafka 0.10 and spark 2.0.0 Key: SPARK-20037 URL: https://issues.apache.org/jira/browse/SPARK-20037 Project: Spark

[jira] [Comment Edited] (SPARK-20019) spark can not load alluxio fileSystem after adding jar

2017-03-20 Thread roncenzhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932621#comment-15932621 ] roncenzhao edited comment on SPARK-20019 at 3/20/17 1:25 PM: - [~srowen] I

[jira] [Resolved] (SPARK-19990) Flaky test: org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite: create temporary view using

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19990. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17338

[jira] [Assigned] (SPARK-19990) Flaky test: org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite: create temporary view using

2017-03-20 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19990: --- Assignee: Song Jun > Flaky test: org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite:

[jira] [Resolved] (SPARK-20031) sc.wholeTextFiles + toDebugString takes long time even before action is performed

2017-03-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20031. --- Resolution: Invalid "Why is this slow?" isn't suitable for JIRA. You'd want to start by asking on

[jira] [Commented] (SPARK-14388) Create Table

2017-03-20 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932960#comment-15932960 ] Yin Huai commented on SPARK-14388: -- [~erlu] I see. Can you create a jira for this? Let's put an example

[jira] [Created] (SPARK-20035) spark-csv writes empty file if no record is in the dataset

2017-03-20 Thread Andrew (JIRA)
Andrew created SPARK-20035: -- Summary: spark-csv writes empty file if no record is in the dataset Key: SPARK-20035 URL: https://issues.apache.org/jira/browse/SPARK-20035 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20032) SparkIMain getClassOutputDirectory method is missing in spark-repl2.11 with 1.4.1 version.

2017-03-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932673#comment-15932673 ] Sean Owen commented on SPARK-20032: --- There are no API methods here, and it's not expected that the

[jira] [Commented] (SPARK-20019) spark can not load alluxio fileSystem after adding jar

2017-03-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932691#comment-15932691 ] Sean Owen commented on SPARK-20019: --- I mean I believe you need to attach this to your app with --jars,

[jira] [Commented] (SPARK-20032) SparkIMain getClassOutputDirectory method is missing in spark-repl2.11 with 1.4.1 version.

2017-03-20 Thread Nupur Sahu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932676#comment-15932676 ] Nupur Sahu commented on SPARK-20032: So is there any way I can find where the compiled class file is
