[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-05-04 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989240#comment-13989240 ] Koert Kuipers commented on SPARK-1520: -- Xiangrui, I have to stick to sun java.

[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-05-05 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989430#comment-13989430 ] Koert Kuipers commented on SPARK-1520: -- I will try latest master. thanks

[jira] [Created] (SPARK-1801) Open up some private APIs related to creating new RDDs for developers

2014-05-11 Thread koert kuipers (JIRA)
koert kuipers created SPARK-1801: Summary: Open up some private APIs related to creating new RDDs for developers Key: SPARK-1801 URL: https://issues.apache.org/jira/browse/SPARK-1801 Project: Spark

[jira] [Created] (SPARK-1811) Support resizable output buffer for kryo serializer

2014-05-12 Thread koert kuipers (JIRA)
koert kuipers created SPARK-1811: Summary: Support resizable output buffer for kryo serializer Key: SPARK-1811 URL: https://issues.apache.org/jira/browse/SPARK-1811 Project: Spark Issue

[jira] [Created] (SPARK-1863) Allowing user jars to take precedence over Spark jars does not work as expected

2014-05-16 Thread koert kuipers (JIRA)
koert kuipers created SPARK-1863: Summary: Allowing user jars to take precedence over Spark jars does not work as expected Key: SPARK-1863 URL: https://issues.apache.org/jira/browse/SPARK-1863

[jira] [Updated] (SPARK-1863) Allowing user jars to take precedence over Spark jars does not work as expected

2014-05-16 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] koert kuipers updated SPARK-1863: - Description: See here:

[jira] [Created] (SPARK-2543) Resizable serialization buffers for kryo

2014-07-16 Thread koert kuipers (JIRA)
koert kuipers created SPARK-2543: Summary: Resizable serialization buffers for kryo Key: SPARK-2543 URL: https://issues.apache.org/jira/browse/SPARK-2543 Project: Spark Issue Type:

[jira] [Commented] (SPARK-1855) Provide memory-and-local-disk RDD checkpointing

2014-07-24 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073784#comment-14073784 ] koert kuipers commented on SPARK-1855: -- i think this makes sense. we have iterative

[jira] [Created] (SPARK-3655) Secondary sort

2014-09-22 Thread koert kuipers (JIRA)
koert kuipers created SPARK-3655: Summary: Secondary sort Key: SPARK-3655 URL: https://issues.apache.org/jira/browse/SPARK-3655 Project: Spark Issue Type: New Feature Components:

[jira] [Commented] (SPARK-3655) Secondary sort

2014-10-22 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14180155#comment-14180155 ] koert kuipers commented on SPARK-3655: -- i am not sure

[jira] [Comment Edited] (SPARK-3655) Secondary sort

2014-10-22 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14180155#comment-14180155 ] koert kuipers edited comment on SPARK-3655 at 10/22/14 4:54 PM:

[jira] [Commented] (SPARK-3655) Secondary sort

2014-10-22 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14180348#comment-14180348 ] koert kuipers commented on SPARK-3655: -- i went through the code. to allow a secondary

[jira] [Commented] (SPARK-3655) Secondary sort

2014-10-23 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181248#comment-14181248 ] koert kuipers commented on SPARK-3655: -- hey patrick. i was looking into modifying the

[jira] [Commented] (SPARK-3655) Secondary sort

2014-10-23 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181807#comment-14181807 ] koert kuipers commented on SPARK-3655: -- yes, that makes sense. i am working right

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-10-26 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184575#comment-14184575 ] koert kuipers commented on SPARK-3655: -- can you assign to me? i will have 2 pullreq

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-10-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185699#comment-14185699 ] koert kuipers commented on SPARK-3655: -- first pullreq is here:

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-10-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185769#comment-14185769 ] koert kuipers commented on SPARK-3655: -- second pullreq is here:

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-05 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14235766#comment-14235766 ] koert kuipers commented on SPARK-3655: -- something that takes in an ordering, and

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-05 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14236015#comment-14236015 ] koert kuipers commented on SPARK-3655: -- should there be a foldLeft that does not

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-05 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14236267#comment-14236267 ] koert kuipers commented on SPARK-3655: -- [~sandyr] i updated pullreq to include

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-07 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237254#comment-14237254 ] koert kuipers commented on SPARK-3655: -- i also dont like the signature def

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-07 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237299#comment-14237299 ] koert kuipers commented on SPARK-3655: -- i have a new pullreq that implements just

[jira] [Updated] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-07 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] koert kuipers updated SPARK-3655: - Affects Version/s: 1.2.0

[jira] [Comment Edited] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-07 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237299#comment-14237299 ] koert kuipers edited comment on SPARK-3655 at 12/8/14 3:24 AM:

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-08 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238685#comment-14238685 ] koert kuipers commented on SPARK-3655: -- OK that can be done. It definitely highlights

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-08 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238721#comment-14238721 ] koert kuipers commented on SPARK-3655: -- I will update the pullrequest to put out a

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-11 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242547#comment-14242547 ] koert kuipers commented on SPARK-3655: -- i updated the pullreq to use Iterables

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-19 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254337#comment-14254337 ] koert kuipers commented on SPARK-3655: -- Imran, Thanks for taking the time to write

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-12-20 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254973#comment-14254973 ] koert kuipers commented on SPARK-3655: -- Imran, I think the groupAndSort function is

[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2

2015-02-05 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307254#comment-14307254 ] koert kuipers commented on SPARK-2808: -- what is the motivation for this upgrade?

[jira] [Comment Edited] (SPARK-2808) update kafka to version 0.8.2

2015-02-05 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307254#comment-14307254 ] koert kuipers edited comment on SPARK-2808 at 2/5/15 2:28 PM: --

[jira] [Comment Edited] (SPARK-2808) update kafka to version 0.8.2

2015-02-11 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317240#comment-14317240 ] koert kuipers edited comment on SPARK-2808 at 2/12/15 12:05 AM:

[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2

2015-02-11 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317240#comment-14317240 ] koert kuipers commented on SPARK-2808: -- scala 2.11, thats good point, i didnt think

[jira] [Commented] (SPARK-3306) Addition of external resource dependency in executors

2015-01-03 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263761#comment-14263761 ] koert kuipers commented on SPARK-3306: -- i am also interested in resources that can be

[jira] [Comment Edited] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-04-28 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517946#comment-14517946 ] koert kuipers edited comment on SPARK-3655 at 4/28/15 8:18 PM:

[jira] [Comment Edited] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-04-28 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517946#comment-14517946 ] koert kuipers edited comment on SPARK-3655 at 4/28/15 8:19 PM:

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-04-28 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517946#comment-14517946 ] koert kuipers commented on SPARK-3655: -- since the last pullreq for this ticket i

[jira] [Created] (SPARK-8577) ScalaReflectionLock.synchronized can cause deadlock

2015-06-23 Thread koert kuipers (JIRA)
koert kuipers created SPARK-8577: Summary: ScalaReflectionLock.synchronized can cause deadlock Key: SPARK-8577 URL: https://issues.apache.org/jira/browse/SPARK-8577 Project: Spark Issue

[jira] [Commented] (SPARK-4644) Implement skewed join

2015-06-18 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14591929#comment-14591929 ] koert kuipers commented on SPARK-4644: -- i believe (plz correct me if i am wrong) that

[jira] [Created] (SPARK-8398) Consistently expose Hadoop Configuration/JobConf parameters for Hadoop input/output formats

2015-06-16 Thread koert kuipers (JIRA)
koert kuipers created SPARK-8398: Summary: Consistently expose Hadoop Configuration/JobConf parameters for Hadoop input/output formats Key: SPARK-8398 URL: https://issues.apache.org/jira/browse/SPARK-8398

[jira] [Commented] (SPARK-8817) DataFrame should not allow duplicate column names

2015-07-03 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613389#comment-14613389 ] koert kuipers commented on SPARK-8817: -- its also worth pointing out that panda's

[jira] [Created] (SPARK-8817) DataFrame should not allow duplicate column names

2015-07-03 Thread koert kuipers (JIRA)
koert kuipers created SPARK-8817: Summary: DataFrame should not allow duplicate column names Key: SPARK-8817 URL: https://issues.apache.org/jira/browse/SPARK-8817 Project: Spark Issue Type:

[jira] [Commented] (SPARK-8577) ScalaReflectionLock.synchronized can cause deadlock

2015-06-28 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604935#comment-14604935 ] koert kuipers commented on SPARK-8577: -- i do not have a way to reproduce this in

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-08-12 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694302#comment-14694302 ] Koert Kuipers commented on SPARK-3655: -- it depends on the size of your values per

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-08-12 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694499#comment-14694499 ] Koert Kuipers commented on SPARK-3655: -- i assume you want to do some analysis on the

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-08-21 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706817#comment-14706817 ] Koert Kuipers commented on SPARK-3655: -- hey nick, i believe your problem sounds like

[jira] [Created] (SPARK-10185) Spark SQL does not handle comma-separated paths on Hadoop FileSystem

2015-08-24 Thread koert kuipers (JIRA)
koert kuipers created SPARK-10185: - Summary: Spark SQL does not handle comma-separated paths on Hadoop FileSystem Key: SPARK-10185 URL: https://issues.apache.org/jira/browse/SPARK-10185 Project:

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-08-25 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14711668#comment-14711668 ] Koert Kuipers commented on SPARK-3655: -- oh, thats no good i am using guava without

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-08-25 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14711701#comment-14711701 ] Koert Kuipers commented on SPARK-3655: -- Great. We have stress tested it with millions

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-08-25 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14711382#comment-14711382 ] Koert Kuipers commented on SPARK-3655: -- glad to hear it worked well. totally agree

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-10-28 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979713#comment-14979713 ] Koert Kuipers commented on SPARK-3655: -- say if your input is sessionId|json and you have a way to

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-10-28 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978000#comment-14978000 ] Koert Kuipers commented on SPARK-3655: -- spark-sorted (https://github.com/tresata/spark-sorted) allows

[jira] [Commented] (SPARK-11441) HadoopFsRelation is not scalable in number of files read/written

2015-11-12 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15002751#comment-15002751 ] koert kuipers commented on SPARK-11441: --- going over the code base it seems that there are 2

[jira] [Created] (SPARK-11441) HadoopFsRelation is not scalable in number of files read/written

2015-11-01 Thread koert kuipers (JIRA)
koert kuipers created SPARK-11441: - Summary: HadoopFsRelation is not scalable in number of files read/written Key: SPARK-11441 URL: https://issues.apache.org/jira/browse/SPARK-11441 Project: Spark

[jira] [Commented] (SPARK-10185) Spark SQL does not handle comma-separated paths on Hadoop FileSystem

2015-10-06 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14945180#comment-14945180 ] koert kuipers commented on SPARK-10185: --- someone else is also running into this:

[jira] [Commented] (SPARK-10185) Spark SQL does not handle comma-separated paths on Hadoop FileSystem

2015-10-08 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949931#comment-14949931 ] koert kuipers commented on SPARK-10185: --- [~marmbrus] made it clear that the goal is to support

[jira] [Commented] (SPARK-5741) Support the path contains comma in HiveContext

2015-08-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717235#comment-14717235 ] koert kuipers commented on SPARK-5741: -- i am reading avro and csv mostly. but we try

[jira] [Commented] (SPARK-5741) Support the path contains comma in HiveContext

2015-08-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717053#comment-14717053 ] koert kuipers commented on SPARK-5741: -- i realize i am late to the party but... by

[jira] [Comment Edited] (SPARK-5741) Support the path contains comma in HiveContext

2015-08-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717450#comment-14717450 ] koert kuipers edited comment on SPARK-5741 at 8/27/15 8:22 PM:

[jira] [Commented] (SPARK-5741) Support the path contains comma in HiveContext

2015-08-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717450#comment-14717450 ] koert kuipers commented on SPARK-5741: -- given the requirement of source/binary

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-09-01 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14726664#comment-14726664 ] Koert Kuipers commented on SPARK-3655: -- Did you build a version that does not use Optional for java

[jira] [Commented] (SPARK-603) add simple Counter API

2015-09-08 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736126#comment-14736126 ] koert kuipers commented on SPARK-603: - we use counters a lot in scalding (to verify records counts

[jira] [Commented] (SPARK-732) Recomputation of RDDs may result in duplicated accumulator updates

2015-09-08 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736120#comment-14736120 ] koert kuipers commented on SPARK-732: - It is not clear to me what the usage of accumulators is without

[jira] [Commented] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2015-08-25 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14711986#comment-14711986 ] Koert Kuipers commented on SPARK-3655: -- i believe its straightforward to get rid of

[jira] [Commented] (SPARK-11441) HadoopFsRelation is not scalable in number of files read/written

2015-11-29 Thread Koert Kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031187#comment-15031187 ] Koert Kuipers commented on SPARK-11441: --- Besides a flag to disable the cache a version of buildScan

[jira] [Commented] (SPARK-11967) Use varargs for multiple paths in DataFrameReader

2015-11-24 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025421#comment-15025421 ] koert kuipers commented on SPARK-11967: --- i found the comment in my pullreq: * calling the function

[jira] [Commented] (SPARK-11967) Use varargs for multiple paths in DataFrameReader

2015-11-24 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025424#comment-15025424 ] koert kuipers commented on SPARK-11967: --- agreed that varargs is easier. thanks

[jira] [Commented] (SPARK-11967) Use varargs for multiple paths in DataFrameReader

2015-11-24 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025413#comment-15025413 ] koert kuipers commented on SPARK-11967: --- i think i had varargs originally, and then someone asked

[jira] [Commented] (SPARK-11441) HadoopFsRelation is not scalable in number of files read/written

2015-11-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030385#comment-15030385 ] koert kuipers commented on SPARK-11441: --- i was able to rather trivially remove the cache of

[jira] [Commented] (SPARK-11441) HadoopFsRelation is not scalable in number of files read/written

2015-11-17 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15009690#comment-15009690 ] koert kuipers commented on SPARK-11441: --- one more place where the cache of FileStatus objects for

[jira] [Updated] (SPARK-15769) Add Encoder for input type to Aggregator

2016-06-04 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] koert kuipers updated SPARK-15769: -- Description: Currently org.apache.spark.sql.expressions.Aggregator has Encoders for its

[jira] [Created] (SPARK-15769) Add Encoder for input type to Aggregator

2016-06-04 Thread koert kuipers (JIRA)
koert kuipers created SPARK-15769: - Summary: Add Encoder for input type to Aggregator Key: SPARK-15769 URL: https://issues.apache.org/jira/browse/SPARK-15769 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15507) ClassCastException: SomeCaseClass cannot be cast to org.apache.spark.sql.Row

2016-06-04 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315637#comment-15315637 ] koert kuipers commented on SPARK-15507: --- that doesnt work for RDD[Row], and we update items within

[jira] [Comment Edited] (SPARK-15507) ClassCastException: SomeCaseClass cannot be cast to org.apache.spark.sql.Row

2016-06-04 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315637#comment-15315637 ] koert kuipers edited comment on SPARK-15507 at 6/4/16 9:14 PM: --- that doesnt

[jira] [Created] (SPARK-15798) Secondary sort in Dataset/DataFrame

2016-06-06 Thread koert kuipers (JIRA)
koert kuipers created SPARK-15798: - Summary: Secondary sort in Dataset/DataFrame Key: SPARK-15798 URL: https://issues.apache.org/jira/browse/SPARK-15798 Project: Spark Issue Type: New

[jira] [Created] (SPARK-15810) Aggregator doesn't play nice with Option

2016-06-07 Thread koert kuipers (JIRA)
koert kuipers created SPARK-15810: - Summary: Aggregator doesn't play nice with Option Key: SPARK-15810 URL: https://issues.apache.org/jira/browse/SPARK-15810 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-15810) Aggregator doesn't play nice with Option

2016-06-07 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] koert kuipers updated SPARK-15810: -- Description: {noformat} val ds1 = List(("a", 1), ("a", 2), ("a", 3)).toDS val ds2

[jira] [Commented] (SPARK-15780) Support mapValues on KeyValueGroupedDataset

2016-06-07 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319594#comment-15319594 ] koert kuipers commented on SPARK-15780: --- also see this discussion:

[jira] [Comment Edited] (SPARK-15780) Support mapValues on KeyValueGroupedDataset

2016-06-07 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319594#comment-15319594 ] koert kuipers edited comment on SPARK-15780 at 6/7/16 10:34 PM: also see

[jira] [Created] (SPARK-15780) Support mapValues on KeyValueGroupedDataset

2016-06-06 Thread koert kuipers (JIRA)
koert kuipers created SPARK-15780: - Summary: Support mapValues on KeyValueGroupedDataset Key: SPARK-15780 URL: https://issues.apache.org/jira/browse/SPARK-15780 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15507) ClassCastException: SomeCaseClass cannot be cast to org.apache.spark.sql.Row

2016-06-06 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317131#comment-15317131 ] koert kuipers commented on SPARK-15507: --- we do not know all the columns, but we do not want to drop

[jira] [Commented] (SPARK-15780) Support mapValues on KeyValueGroupedDataset

2016-06-06 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316943#comment-15316943 ] koert kuipers commented on SPARK-15780: --- original discussion is here:

[jira] [Commented] (SPARK-15507) ClassCastException: SomeCaseClass cannot be cast to org.apache.spark.sql.Row

2016-06-06 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317156#comment-15317156 ] koert kuipers commented on SPARK-15507: --- also soon we will not try to do this in a RDD anymore

[jira] [Commented] (SPARK-15507) ClassCastException: SomeCaseClass cannot be cast to org.apache.spark.sql.Row

2016-06-03 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315306#comment-15315306 ] koert kuipers commented on SPARK-15507: --- this used to make it very easy to go back and forth

[jira] [Comment Edited] (SPARK-15507) ClassCastException: SomeCaseClass cannot be cast to org.apache.spark.sql.Row

2016-06-03 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315306#comment-15315306 ] koert kuipers edited comment on SPARK-15507 at 6/4/16 4:20 AM: --- this used

[jira] [Commented] (SPARK-13184) Support minPartitions parameter for JSON and CSV datasources as options

2016-05-25 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301114#comment-15301114 ] koert kuipers commented on SPARK-13184: --- agreed, there should be a general way for data sources to

[jira] [Commented] (SPARK-15598) Change Aggregator.zero to Aggregator.init

2016-05-28 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305651#comment-15305651 ] koert kuipers commented on SPARK-15598: --- how will change this impact usage like this: {noformat}

[jira] [Comment Edited] (SPARK-15598) Change Aggregator.zero to Aggregator.init

2016-05-28 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305651#comment-15305651 ] koert kuipers edited comment on SPARK-15598 at 5/28/16 9:54 PM: how would

[jira] [Commented] (SPARK-15034) Use the value of spark.sql.warehouse.dir as the warehouse location instead of using hive.metastore.warehouse.dir

2016-05-26 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302314#comment-15302314 ] koert kuipers commented on SPARK-15034: --- since the system property user.dir is a local filesystem

[jira] [Commented] (SPARK-15575) Remove breeze from dependencies?

2016-05-27 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304135#comment-15304135 ] koert kuipers commented on SPARK-15575: --- we can help out porting breeze to scala 2.12? > Remove

[jira] [Commented] (SPARK-15034) Use the value of spark.sql.warehouse.dir as the warehouse location instead of using hive.metastore.warehouse.dir

2016-05-25 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301508#comment-15301508 ] koert kuipers commented on SPARK-15034: --- this just hit me today. i build spark 2.0.0-SNAPSHOT

[jira] [Commented] (SPARK-15598) Change Aggregator.zero to Aggregator.init

2016-05-26 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303418#comment-15303418 ] koert kuipers commented on SPARK-15598: --- the reason i ask is that if you plan to do: {noformat}

[jira] [Comment Edited] (SPARK-15598) Change Aggregator.zero to Aggregator.init

2016-05-26 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303396#comment-15303396 ] koert kuipers edited comment on SPARK-15598 at 5/27/16 3:22 AM: just to

[jira] [Commented] (SPARK-15598) Change Aggregator.zero to Aggregator.init

2016-05-26 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303375#comment-15303375 ] koert kuipers commented on SPARK-15598: --- this makes a lot of sense, and is consistent with

[jira] [Commented] (SPARK-15598) Change Aggregator.zero to Aggregator.init

2016-05-26 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303396#comment-15303396 ] koert kuipers commented on SPARK-15598: --- just to be clear, your intention is to use it roughly as

[jira] [Created] (SPARK-13246) Avro 1.7.7 Schema.parse race condition hangs task

2016-02-09 Thread koert kuipers (JIRA)
koert kuipers created SPARK-13246: - Summary: Avro 1.7.7 Schema.parse race condition hangs task Key: SPARK-13246 URL: https://issues.apache.org/jira/browse/SPARK-13246 Project: Spark Issue

[jira] [Commented] (SPARK-13246) Avro 1.7.7 Schema.parse race condition hangs task

2016-02-09 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140290#comment-15140290 ] koert kuipers commented on SPARK-13246: --- it seems this only happens when i build spark with hadoop

[jira] [Commented] (SPARK-13184) Support minPartitions parameter for JSON and CSV datasources as options

2016-02-04 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132711#comment-15132711 ] koert kuipers commented on SPARK-13184: --- thanks for creating this! i think the issue is to

[jira] [Commented] (SPARK-13531) Some DataFrame joins stopped working with UnsupportedOperationException: No size estimation available for objects

2016-02-28 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171420#comment-15171420 ] koert kuipers commented on SPARK-13531: --- can you try this? {noformat} val df1 = sc.makeRDD(1 to

[jira] [Created] (SPARK-13531) Some DataFrame joins stopped working with UnsupportedOperationException: No size estimation available for objects

2016-02-27 Thread koert kuipers (JIRA)
koert kuipers created SPARK-13531: - Summary: Some DataFrame joins stopped working with UnsupportedOperationException: No size estimation available for objects Key: SPARK-13531 URL:
