[jira] [Created] (SPARK-22181) ReplaceExceptWithNotFilter if one or both of the datasets are fully derived out of Filters from a same parent

2017-10-01 Thread Sathiya Kumar (JIRA)
Sathiya Kumar created SPARK-22181: Summary: ReplaceExceptWithNotFilter if one or both of the datasets are fully derived out of Filters from a same parent Key: SPARK-22181 URL:

[jira] [Commented] (SPARK-22080) Allow developers to add pre-optimisation rules

2017-09-20 Thread Sathiya Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173313#comment-16173313 ] Sathiya Kumar commented on SPARK-22080: Here is a PR with the proposed changes:

[jira] [Updated] (SPARK-22080) Allow developers to add pre-optimisation rules

2017-09-20 Thread Sathiya Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sathiya Kumar updated SPARK-22080: Description: [SPARK-9843] added support for adding custom rules for optimising LogicalPlan,

[jira] [Created] (SPARK-22080) Allow developers to add pre-optimisation rules

2017-09-20 Thread Sathiya Kumar (JIRA)
Sathiya Kumar created SPARK-22080: Summary: Allow developers to add pre-optimisation rules Key: SPARK-22080 URL: https://issues.apache.org/jira/browse/SPARK-22080 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20589) Allow limiting task concurrency per stage

2017-08-16 Thread Amit Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16129289#comment-16129289 ] Amit Kumar commented on SPARK-20589: As you said, adding job boundary via code will be much easier

[jira] [Commented] (SPARK-20589) Allow limiting task concurrency per stage

2017-08-15 Thread Amit Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127930#comment-16127930 ] Amit Kumar commented on SPARK-20589: [~imranr] Like you said, we could restrict the number of

[jira] [Commented] (SPARK-20589) Allow limiting task concurrency per stage

2017-05-15 Thread Amit Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011024#comment-16011024 ] Amit Kumar commented on SPARK-20589: I can probably give more reference. This originally arose from

[jira] [Closed] (SPARK-20671) Processing muitple kafka topics with single spark streaming context hangs on batchSubmitted.

2017-05-09 Thread amit kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amit kumar closed SPARK-20671. Resolution: Not A Problem My bad. I configured it wrong. setMaster("local[*]") in place of

[jira] [Created] (SPARK-20671) Processing muitple kafka topics with single spark streaming context hangs on batchSubmitted.

2017-05-08 Thread amit kumar (JIRA)
amit kumar created SPARK-20671: Summary: Processing muitple kafka topics with single spark streaming context hangs on batchSubmitted. Key: SPARK-20671 URL: https://issues.apache.org/jira/browse/SPARK-20671

[jira] [Issue Comment Deleted] (SPARK-12180) DataFrame.join() in PySpark gives misleading exception when column name exists on both side

2017-03-08 Thread Abhishek Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Kumar updated SPARK-12180: Comment: was deleted (was: Is there any concrete solution or reason explaining the issue? I

[jira] [Comment Edited] (SPARK-12180) DataFrame.join() in PySpark gives misleading exception when column name exists on both side

2017-03-08 Thread Abhishek Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902389#comment-15902389 ] Abhishek Kumar edited comment on SPARK-12180 at 3/9/17 2:52 AM: Is there

[jira] [Commented] (SPARK-12180) DataFrame.join() in PySpark gives misleading exception when column name exists on both side

2017-03-08 Thread Abhishek Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902389#comment-15902389 ] Abhishek Kumar commented on SPARK-12180: Is there any concrete solution or reason explaining the

[jira] [Updated] (SPARK-19255) SQL Listener is causing out of memory, in case of data size is in petabytes.

2017-01-21 Thread Ashok Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashok Kumar updated SPARK-19255: Description: Since it's difficult to load a huge dataset, the steps below will help in reproducing the

[jira] [Updated] (SPARK-19255) SQL Listener is causing out of memory, in case of data size is in petabytes.

2017-01-21 Thread Ashok Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashok Kumar updated SPARK-19255: Summary: SQL Listener is causing out of memory, in case of data size is in petabytes. (was: SQL

[jira] [Commented] (SPARK-19255) SQL Listener is causing out of memory, in case of large no of shuffle partition

2017-01-19 Thread Ashok Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829715#comment-15829715 ] Ashok Kumar commented on SPARK-19255: @Sean I will put my scenario in another way. Assume each data

[jira] [Commented] (SPARK-19255) SQL Listener is causing out of memory, in case of large no of shuffle partition

2017-01-17 Thread Ashok Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827379#comment-15827379 ] Ashok Kumar commented on SPARK-19255: @Takeshi, thanks for looking into the issue. You are right,

[jira] [Updated] (SPARK-19255) SQL Listener is causing out of memory, in case of large no of shuffle partition

2017-01-17 Thread Ashok Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashok Kumar updated SPARK-19255: Attachment: spark_sqllistener_oom.png The attached snapshot is a heap dump profiled with Eclipse MAT >

[jira] [Commented] (SPARK-19255) SQL Listener is causing out of memory, in case of large no of shuffle partition

2017-01-16 Thread Ashok Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15825476#comment-15825476 ] Ashok Kumar commented on SPARK-19255: Due to an issue, I am not able to attach a screenshot. I will

[jira] [Created] (SPARK-19255) SQL Listener is causing out of memory, in case of large no of shuffle partition

2017-01-16 Thread Ashok Kumar (JIRA)
Ashok Kumar created SPARK-19255: Summary: SQL Listener is causing out of memory, in case of large no of shuffle partition Key: SPARK-19255 URL: https://issues.apache.org/jira/browse/SPARK-19255

[jira] [Commented] (SPARK-18986) ExternalAppendOnlyMap shouldn't fail when forced to spill before calling its iterator

2016-12-29 Thread Sameer Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15786971#comment-15786971 ] Sameer Kumar commented on SPARK-18986: Shouldn't the priority be increased for this because I am

[jira] [Commented] (SPARK-18200) GraphX Invalid initial capacity when running triangleCount

2016-12-03 Thread Sumesh Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15719170#comment-15719170 ] Sumesh Kumar commented on SPARK-18200: Thanks much [~dongjoon] > GraphX Invalid initial capacity

[jira] [Commented] (SPARK-18200) GraphX Invalid initial capacity when running triangleCount

2016-12-03 Thread Sumesh Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15719081#comment-15719081 ] Sumesh Kumar commented on SPARK-18200: Does this issue exist currently in version 2.0.1? I just

[jira] [Commented] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Bipul Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673547#comment-15673547 ] Bipul Kumar commented on SPARK-18489: OK. Thanks Herman. I'll have a look at the changes. >

[jira] [Commented] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Bipul Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673528#comment-15673528 ] Bipul Kumar commented on SPARK-18489: [~hvanhovell] Will this change cover all the operators

[jira] [Comment Edited] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Bipul Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673469#comment-15673469 ] Bipul Kumar edited comment on SPARK-18489 at 11/17/16 11:35 AM:

[jira] [Issue Comment Deleted] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Bipul Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bipul Kumar updated SPARK-18489: Comment: was deleted (was: [~prashant_] please review this.) > Implicit type conversion during

[jira] [Commented] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Bipul Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673469#comment-15673469 ] Bipul Kumar commented on SPARK-18489: [~prashant_] please review this. > Implicit type conversion

[jira] [Commented] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Bipul Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673470#comment-15673470 ] Bipul Kumar commented on SPARK-18489: [~prashant_] please review this. > Implicit type conversion

[jira] [Created] (SPARK-18489) Implicit type conversion during comparision between Integer type column and String type column

2016-11-17 Thread Bipul Kumar (JIRA)
Bipul Kumar created SPARK-18489: Summary: Implicit type conversion during comparision between Integer type column and String type column Key: SPARK-18489 URL: https://issues.apache.org/jira/browse/SPARK-18489

[jira] [Commented] (SPARK-18150) Spark 2.* failes to create partitions for avro files

2016-10-28 Thread Sunil Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614587#comment-15614587 ] Sunil Kumar commented on SPARK-18150: I am using Spark : 2.0.0.24 and spark-avro : 2.11-3.0.1. >

[jira] [Updated] (SPARK-18150) Spark 2.* failes to create partitions for avro files

2016-10-28 Thread Sunil Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Kumar updated SPARK-18150: Description: I am using Apache Spark 2.0.1 for processing the Grid HDFS Avro file, however I

[jira] [Created] (SPARK-18154) CLONE - Change Source API so that sources do not need to keep unbounded state

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18154: Summary: CLONE - Change Source API so that sources do not need to keep unbounded state Key: SPARK-18154 URL: https://issues.apache.org/jira/browse/SPARK-18154 Project:

[jira] [Created] (SPARK-18151) CLONE - MetadataLog should support purging old logs

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18151: Summary: CLONE - MetadataLog should support purging old logs Key: SPARK-18151 URL: https://issues.apache.org/jira/browse/SPARK-18151 Project: Spark Issue

[jira] [Created] (SPARK-18155) CLONE - HDFSMetadataLog should not leak CRC files

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18155: Summary: CLONE - HDFSMetadataLog should not leak CRC files Key: SPARK-18155 URL: https://issues.apache.org/jira/browse/SPARK-18155 Project: Spark Issue Type:

[jira] [Created] (SPARK-18152) CLONE - FileStreamSource should not track the list of seen files indefinitely

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18152: Summary: CLONE - FileStreamSource should not track the list of seen files indefinitely Key: SPARK-18152 URL: https://issues.apache.org/jira/browse/SPARK-18152 Project:

[jira] [Created] (SPARK-18150) Spark 2.* failes to create partitions for avro files

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18150: Summary: Spark 2.* failes to create partitions for avro files Key: SPARK-18150 URL: https://issues.apache.org/jira/browse/SPARK-18150 Project: Spark Issue

[jira] [Created] (SPARK-18157) CLONE - Support purging aged file entry for FileStreamSource metadata log

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18157: Summary: CLONE - Support purging aged file entry for FileStreamSource metadata log Key: SPARK-18157 URL: https://issues.apache.org/jira/browse/SPARK-18157 Project:

[jira] [Created] (SPARK-18156) CLONE - StreamExecution should discard unneeded metadata

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18156: Summary: CLONE - StreamExecution should discard unneeded metadata Key: SPARK-18156 URL: https://issues.apache.org/jira/browse/SPARK-18156 Project: Spark Issue

[jira] [Created] (SPARK-18153) CLONE - Ability to remove old metadata for structure streaming MetadataLog

2016-10-28 Thread Sunil Kumar (JIRA)
Sunil Kumar created SPARK-18153: Summary: CLONE - Ability to remove old metadata for structure streaming MetadataLog Key: SPARK-18153 URL: https://issues.apache.org/jira/browse/SPARK-18153 Project:

[jira] [Updated] (SPARK-17973) is there any way to split Dataset into 2 or more based on the given condition

2016-10-17 Thread sriram kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sriram kumar updated SPARK-17973: Description: I am not able to split a Dataset exactly by a condition. I have a scenario where I

[jira] [Created] (SPARK-17973) is there any way to split Dataset into 2 or more based on the given condition

2016-10-17 Thread sriram kumar (JIRA)
sriram kumar created SPARK-17973: Summary: is there any way to split Dataset into 2 or more based on the given condition Key: SPARK-17973 URL: https://issues.apache.org/jira/browse/SPARK-17973

[jira] [Comment Edited] (SPARK-16575) partition calculation mismatch with sc.binaryFiles

2016-10-14 Thread Tarun Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572950#comment-15572950 ] Tarun Kumar edited comment on SPARK-16575 at 10/14/16 8:38 AM: [~rxin] I

[jira] [Commented] (SPARK-16575) partition calculation mismatch with sc.binaryFiles

2016-10-13 Thread Tarun Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572950#comment-15572950 ] Tarun Kumar commented on SPARK-16575: [~rxin] I have now added the support of openCostInBytes,

[jira] [Commented] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle

2016-10-07 Thread karan kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554871#comment-15554871 ] karan kumar commented on SPARK-4105: I also ran into this error: java.io.IOException:

[jira] [Comment Edited] (SPARK-10697) Lift Calculation in Association Rule mining

2016-09-07 Thread Yashwanth Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470104#comment-15470104 ] Yashwanth Kumar edited comment on SPARK-10697 at 9/7/16 9:33 AM: Yes

[jira] [Comment Edited] (SPARK-10697) Lift Calculation in Association Rule mining

2016-09-07 Thread Yashwanth Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470104#comment-15470104 ] Yashwanth Kumar edited comment on SPARK-10697 at 9/7/16 9:19 AM: As

[jira] [Reopened] (SPARK-10697) Lift Calculation in Association Rule mining

2016-09-07 Thread Yashwanth Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yashwanth Kumar reopened SPARK-10697: As discussed, Rule is one of the necessary measure that eliminates the concern of random

[jira] [Created] (SPARK-17146) Add RandomizedSearch to the CrossValidator API

2016-08-18 Thread Manoj Kumar (JIRA)
Manoj Kumar created SPARK-17146: Summary: Add RandomizedSearch to the CrossValidator API Key: SPARK-17146 URL: https://issues.apache.org/jira/browse/SPARK-17146 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17116) Allow params to be a {string, value} dict at fit time

2016-08-17 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425522#comment-15425522 ] Manoj Kumar commented on SPARK-17116: Haha, not really. I just found it odd that setParams accepts

[jira] [Created] (SPARK-17118) Make examples Python3 compatible

2016-08-17 Thread Manoj Kumar (JIRA)
Manoj Kumar created SPARK-17118: Summary: Make examples Python3 compatible Key: SPARK-17118 URL: https://issues.apache.org/jira/browse/SPARK-17118 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-17116) Allow params to be a {string, value} dict at fit time

2016-08-17 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Kumar updated SPARK-17116: Description: Currently, it is possible to override the default params set at constructor time by

[jira] [Comment Edited] (SPARK-17116) Allow params to be a {string, value} dict at fit time

2016-08-17 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425494#comment-15425494 ] Manoj Kumar edited comment on SPARK-17116 at 8/17/16 10:17 PM: [~josephkb]

[jira] [Commented] (SPARK-17116) Allow params to be a {string, value} dict at fit time

2016-08-17 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425494#comment-15425494 ] Manoj Kumar commented on SPARK-17116: [~josephkb] [~mlnick] This is not super important, but I do

[jira] [Updated] (SPARK-17116) Allow params to be a {string, value} dict at fit time

2016-08-17 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Kumar updated SPARK-17116: Summary: Allow params to be a {string, value} dict at fit time (was: Allow params to be a

[jira] [Created] (SPARK-17116) Allow params to be a {string, value} dict

2016-08-17 Thread Manoj Kumar (JIRA)
Manoj Kumar created SPARK-17116: Summary: Allow params to be a {string, value} dict Key: SPARK-17116 URL: https://issues.apache.org/jira/browse/SPARK-17116 Project: Spark Issue Type:

[jira] [Created] (SPARK-16585) Update inner fields of complex types in dataframes

2016-07-16 Thread Naveen Kumar (JIRA)
Naveen Kumar created SPARK-16585: Summary: Update inner fields of complex types in dataframes Key: SPARK-16585 URL: https://issues.apache.org/jira/browse/SPARK-16585 Project: Spark Issue

[jira] [Commented] (SPARK-15347) Problem select empty ORC table

2016-07-13 Thread Sunil Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374599#comment-15374599 ] Sunil Kumar commented on SPARK-15347: Hi, is there any plan to fix this issue in Spark-SQL and

[jira] [Commented] (SPARK-16365) Ideas for moving "mllib-local" forward

2016-07-08 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368602#comment-15368602 ] Manoj Kumar commented on SPARK-16365: Could you be a bit clearer about the first point? Is it

[jira] [Commented] (SPARK-3728) RandomForest: Learn models too large to store in memory

2016-07-08 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368216#comment-15368216 ] Manoj Kumar commented on SPARK-3728: Hi [~xusen]. Are you still working on this? > RandomForest:

[jira] [Commented] (SPARK-16365) Ideas for moving "mllib-local" forward

2016-07-07 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366978#comment-15366978 ] Manoj Kumar commented on SPARK-16365: Is the ultimate aim to make mllib-local, the scikit-learn of

[jira] [Commented] (SPARK-16399) Set PYSPARK_PYTHON to point to "python" instead of "python2.7"

2016-07-07 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366392#comment-15366392 ] Manoj Kumar commented on SPARK-16399: It would just run with the default python, that is in this

[jira] [Created] (SPARK-16399) Set PYSPARK_PYTHON to point to "python" instead of "python2.7"

2016-07-06 Thread Manoj Kumar (JIRA)
Manoj Kumar created SPARK-16399: Summary: Set PYSPARK_PYTHON to point to "python" instead of "python2.7" Key: SPARK-16399 URL: https://issues.apache.org/jira/browse/SPARK-16399 Project: Spark

[jira] [Created] (SPARK-16307) Improve testing for DecisionTree variances

2016-06-29 Thread Manoj Kumar (JIRA)
Manoj Kumar created SPARK-16307: Summary: Improve testing for DecisionTree variances Key: SPARK-16307 URL: https://issues.apache.org/jira/browse/SPARK-16307 Project: Spark Issue Type: Test

[jira] [Created] (SPARK-16306) Improve testing for DecisionTree variances

2016-06-29 Thread Manoj Kumar (JIRA)
Manoj Kumar created SPARK-16306: Summary: Improve testing for DecisionTree variances Key: SPARK-16306 URL: https://issues.apache.org/jira/browse/SPARK-16306 Project: Spark Issue Type: Test

[jira] [Issue Comment Deleted] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-06-27 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-15666: Comment: was deleted (was: [~hvanhovell] by saying that underlying plan is broken, you mean

[jira] [Comment Edited] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-24 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348043#comment-15348043 ] Manish Kumar edited comment on SPARK-16169 at 6/24/16 9:14 AM: Even if our

[jira] [Commented] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-24 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348043#comment-15348043 ] Manish Kumar commented on SPARK-16169: Even if our code is asking to do more work than some task

[jira] [Comment Edited] (SPARK-14351) Optimize ImpurityAggregator for decision trees

2016-06-23 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335165#comment-15335165 ] Manoj Kumar edited comment on SPARK-14351 at 6/23/16 11:43 PM: OK, so here

[jira] [Commented] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346267#comment-15346267 ] Manish Kumar commented on SPARK-16169: Hi [~srowen] I am not saying it is taking 5 minutes longer

[jira] [Comment Edited] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346267#comment-15346267 ] Manish Kumar edited comment on SPARK-16169 at 6/23/16 10:58 AM: Hi

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: Description: When a spark application is (written in scala) trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: Attachment: Spark-UI.png > Saving Intermediate dataframe increasing processing time upto 5

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: Description: When a spark application is written in scala trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: Description: When a spark application is (written in scala) trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: Description: When, a spark application written in scala trying to save intermediate dataframe,

[jira] [Created] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
Manish Kumar created SPARK-16169: Summary: Saving Intermediate dataframe increasing processing time upto 5 times. Key: SPARK-16169 URL: https://issues.apache.org/jira/browse/SPARK-16169 Project:

[jira] [Issue Comment Deleted] (SPARK-14351) Optimize ImpurityAggregator for decision trees

2016-06-22 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Kumar updated SPARK-14351: Comment: was deleted (was: Here are my thoughts: Also ccing [~sethah] since he has seen this part

[jira] [Commented] (SPARK-14351) Optimize ImpurityAggregator for decision trees

2016-06-22 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345182#comment-15345182 ] Manoj Kumar commented on SPARK-14351: Here are my thoughts: Also ccing [~sethah] since he has seen

[jira] [Commented] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-06-20 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339475#comment-15339475 ] Manish Kumar commented on SPARK-15666: [~hvanhovell] by saying that underlying plan is broken, you

[jira] [Comment Edited] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-06-20 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339460#comment-15339460 ] Manish Kumar edited comment on SPARK-15666 at 6/20/16 1:14 PM: Hi

[jira] [Commented] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-06-20 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339460#comment-15339460 ] Manish Kumar commented on SPARK-15666: Hi Herman, I haven't checked it on 2.0. But even if it

[jira] [Commented] (SPARK-14351) Optimize ImpurityAggregator for decision trees

2016-06-16 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335165#comment-15335165 ] Manoj Kumar commented on SPARK-14351: OK, so here are some benchmarks that validate your claims

[jira] [Commented] (SPARK-14351) Optimize ImpurityAggregator for decision trees

2016-06-13 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328941#comment-15328941 ] Manoj Kumar commented on SPARK-14351: I can try working on this. > Optimize ImpurityAggregator for

[jira] [Comment Edited] (SPARK-3155) Support DecisionTree pruning

2016-06-13 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328939#comment-15328939 ] Manoj Kumar edited comment on SPARK-3155 at 6/14/16 5:01 AM: - 1. I agree that

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2016-06-13 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328939#comment-15328939 ] Manoj Kumar commented on SPARK-3155: 1. I agree that the use cases are limited to single trees. You

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2016-06-13 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328592#comment-15328592 ] Manoj Kumar commented on SPARK-3155: I would like to add support for pruning DecisionTrees as part of

[jira] [Issue Comment Deleted] (SPARK-3155) Support DecisionTree pruning

2016-06-13 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Kumar updated SPARK-3155: --- Comment: was deleted (was: I would like to add support for pruning DecisionTrees as part of my

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2016-06-13 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328487#comment-15328487 ] Manoj Kumar commented on SPARK-3155: I would like to add support for pruning DecisionTrees as part of

[jira] [Commented] (SPARK-9623) RandomForestRegressor: provide variance of predictions

2016-06-07 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319450#comment-15319450 ] Manoj Kumar commented on SPARK-9623: [~yanboliang] Are you still working on this? Would you mind if I

[jira] [Created] (SPARK-15761) pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipython an Python3

2016-06-03 Thread Manoj Kumar (JIRA)
Manoj Kumar created SPARK-15761: --- Summary: pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipython an Python3 Key: SPARK-15761 URL: https://issues.apache.org/jira/browse/SPARK-15761 Project:

[jira] [Commented] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-06-01 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310134#comment-15310134 ] Manish Kumar commented on SPARK-15666: -- Can someone please look into this issue? As this blocking

[jira] [Created] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-05-31 Thread Manish Kumar (JIRA)
Manish Kumar created SPARK-15666: Summary: Join on two tables generated from a same table throwing query analyzer issue Key: SPARK-15666 URL: https://issues.apache.org/jira/browse/SPARK-15666

[jira] [Created] (SPARK-15503) Not able to join two hive tables having partition using HiveContext.sql

2016-05-24 Thread Ajesh Kumar (JIRA)
Ajesh Kumar created SPARK-15503: --- Summary: Not able to join two hive tables having partition using HiveContext.sql Key: SPARK-15503 URL: https://issues.apache.org/jira/browse/SPARK-15503 Project: Spark

[jira] [Commented] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2016-05-05 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272123#comment-15272123 ] Manish Kumar commented on SPARK-13699: -- Hi, Even I am facing a similar issue with overwrite mode. I

[jira] [Created] (SPARK-14266) Association with remote system [akka.tcp://sparkDriver@192.168.1.81:34047] has failed, address is now gated for [5000] ms. Reason is: [Association failed$

2016-03-30 Thread Pavan Kumar (JIRA)
Pavan Kumar created SPARK-14266: --- Summary: Association with remote system [akka.tcp://sparkDriver@192.168.1.81:34047] has failed, address is now gated for [5000] ms. Reason is: [Association failed$ Key: SPARK-14266

[jira] [Commented] (SPARK-5433) Spark EC2 doesn't mount local disks for all instance types

2016-03-14 Thread Geet Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194722#comment-15194722 ] Geet Kumar commented on SPARK-5433: --- The spark-ec2 script does not mount disks for the d2.xlarge
