[GitHub] spark pull request: [SPARK-4620] Add unpersist in Graph and GraphI...

2014-11-26 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3476 [SPARK-4620] Add unpersist in Graph and GraphImpl Add an interface to uncache both vertices and edges of Graph/GraphImpl. This interface is useful when iterative graph operations build a new graph in each

[GitHub] spark pull request: [SPARK-4633] Support GZIPOutputStream in spark...

2014-11-26 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3488 [SPARK-4633] Support GZIPOutputStream in spark.io.compression.codec gzip is widely used in other frameworks such as Hadoop MapReduce and Tez, and I also think that gzip is more stable than

[GitHub] spark pull request: [SPARK-4646] Replace Scala.util.Sorting.quickS...

2014-11-30 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3507#issuecomment-65012572 Ok, I fixed it. If there are no issues, please merge it. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well

[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...

2014-12-15 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3709 [SPARK-4858] Add an option to turn off a progress bar in spark-shell Add a '--no-progress-bar' option to easily turn off the progress bar in spark-shell for users who'd like to look into debug logs

[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...

2014-12-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-67291895 @davies Thanks for your comment :) IMHO, it'd be better if switching the bar on/off were independent of the log4j log level. This is useful for users to look into some user

[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...

2014-12-17 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3709#discussion_r21958440 --- Diff: bin/utils.sh --- @@ -32,10 +32,11 @@ function gatherSparkSubmitOpts() { APPLICATION_OPTS=() while (($#)); do case $1

[GitHub] spark pull request: [SPARK-4733] Add missing prameter comments in ...

2014-12-17 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3594#discussion_r21958573 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -60,6 +60,9 @@ abstract class NarrowDependency[T](_rdd: RDD[T]) extends Dependency[T

[GitHub] spark pull request: [SPARK-4733] Add missing prameter comments in ...

2014-12-17 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3594#discussion_r21958597 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -60,6 +60,9 @@ abstract class NarrowDependency[T](_rdd: RDD[T]) extends Dependency[T

[GitHub] spark pull request: [SPARK-4633] Support GZIPOutputStream in spark...

2014-12-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3488#issuecomment-67292648 Understood, Thanks.

[GitHub] spark pull request: [SPARK-4733] Add missing prameter comments in ...

2014-12-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3594#issuecomment-67294237 Fixed.

[GitHub] spark pull request: [SPARK-4917] Add a function to convert into a ...

2014-12-22 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3760 [SPARK-4917] Add a function to convert into a graph with canonical edges in GraphOps Convert bi-directional edges into uni-directional ones instead of 'canonicalOrientation

[GitHub] spark pull request: Add help comments in Analytics

2014-12-22 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3775 Add help comments in Analytics Trivial modifications for usability. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark

[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...

2014-12-23 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-68016680 @andrewor14 OK, I'll re-check it.

[GitHub] spark pull request: [SPARK-4950] Delete obsolete mapReduceTripelets u...

2014-12-23 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3782 [SPARK-4950] Delete obsolete mapReduceTripelets used in Pregel You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark

[GitHub] spark pull request: [SPARK-4950] Delete obsolete mapReduceTripelet...

2014-12-23 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3782#issuecomment-68026190 Any reason not to replace the API along with SPARK-3936?

[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...

2014-12-24 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-68050965 Fixed, please test it.

[GitHub] spark pull request: [SPARK-4970] Fix an implicit bug in SparkSubmitSu...

2014-12-25 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/3805 [SPARK-4970] Fix an implicit bug in SparkSubmitSuite You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark SparkSubmitBugFix Alternatively

[GitHub] spark pull request: [SPARK-4970] Fix an implicit bug in SparkSubmi...

2014-12-25 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3805#issuecomment-68117570 The test 'includes jars passed in through --jars' in SparkSubmitSuite fails when spark.executor.memory is set to more than 512 MiB in conf/spark-defaults.conf

[GitHub] spark pull request: [SPARK-4950] Delete obsolete mapReduceTripelet...

2014-12-25 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3782#issuecomment-68118211 Understood. I went back to the old Pregel API. Also, I'll check #1217 later :))

[GitHub] spark pull request: [SPARK-4970] Do not read spark.executor.memory...

2014-12-29 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3805#issuecomment-68298602 @JoshRosen Many thanks for your comments and sorry to bother you :(( The PR/JIRA title and the detailed message in the PR have been updated according to your comments. If any

[GitHub] spark pull request: [SPARK-4970] Do not read spark.executor.memory...

2014-12-29 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3805#discussion_r22329786 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -209,7 +209,7 @@ object SparkSubmit { OptionAssigner(args.jars, YARN

[GitHub] spark pull request: [SPARK-5380][GraphX] Solve an ArrayIndexOutOfB...

2015-02-03 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4176#issuecomment-72786951 I wonder if stopping the process is the best solution. If there is only one illegal entry in the last line, we need to retry loading the whole file, which is time

[GitHub] spark pull request: [SPARK-2827][GraphX]Add degree distribution op...

2015-02-03 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/1767#issuecomment-72782993 What's the status of this patch? If it can be merged into master, I'll refactor the code and add unit tests.

[GitHub] spark pull request: [SPARK-5484] Checkpoint every 25 iterations in...

2015-02-03 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4273#issuecomment-72790567 How about adding a new configuration, e.g., spark.graphx.pregel.checkpoint.interval in SparkConf?

[GitHub] spark pull request: [SPARK-2827][GraphX]Add degree distribution op...

2015-02-03 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/1767#issuecomment-72804365 Ok, I'll take it.

[GitHub] spark pull request: [SPARK-5484] Checkpoint every 25 iterations in...

2015-02-03 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4273#issuecomment-72804614 And also, this issue seems to be related to SPARK-5561.

[GitHub] spark pull request: [SPARK-5623][GraphX] Replace an obsolete mapRe...

2015-02-05 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/4402 [SPARK-5623][GraphX] Replace an obsolete mapReduceTriplets with a new aggregateMessages in GraphSuite You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-2827][GraphX] Add collectDegreeDist to ...

2015-02-05 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/4399 [SPARK-2827][GraphX] Add collectDegreeDist to compute the distribution of vertex degrees in GraphOps Add degree distribution operators in GraphOps for GraphX. You can merge this pull request

[GitHub] spark pull request: [SPARK-2827][GraphX] Add collectDegreeDist to ...

2015-02-05 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4399#issuecomment-73077087 Refactored #1767 and added unit tests. I made the original patch simpler since this is a first patch to add a new API. If necessary, the other part will be add

[GitHub] spark pull request: [SPARK-2827][GraphX]Add degree distribution op...

2015-02-05 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/1767#issuecomment-73077238 The new PR is done; please review it.

[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-01-21 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/4138 [SPARK-5352][GraphX] Add getPartitionStrategy in Graph Graph remembers an applied partition strategy in partitionBy() and returns it via getPartitionStrategy(). This is useful in case

[GitHub] spark pull request: [SPARK-5351][GraphX] Do not use Partitioner.de...

2015-01-21 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/4136 [SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a partitioner of EdgeRDDImp... If the value of 'spark.default.parallelism' does not match the number of partitions in EdgePartition

[GitHub] spark pull request: [SPARK-5827][SQL] Add missing import in the ex...

2015-02-15 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/4615 [SPARK-5827][SQL] Add missing import in the example of SqlContext If one tries the example by copy-pasting it, an exception is thrown. You can merge this pull request into a Git repository by running

[GitHub] spark pull request: [SPARK-5827][SQL] Add missing import in the ex...

2015-02-15 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4615#issuecomment-74420758 OK, thanks.

[GitHub] spark pull request: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs ...

2015-02-15 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4614#issuecomment-74420544 Understood. I was thinking of solving the issue with the duplicated 'Conf' class in SVDPlusPlus rather than with dummy arguments; /** Obsolete

[GitHub] spark pull request: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs ...

2015-02-15 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4614#issuecomment-74419961 Looks good, though it would be better to use SVDPlusPlus.run() as the name of the entry point for usability, because other libraries such as TriangleCount and PageRank do so

[GitHub] spark pull request: [GraphX] Add a test for GraphLoader.edgeListFi...

2015-02-18 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4674#issuecomment-74879829 oops, I missed it. I'll fix it soon.

[GitHub] spark pull request: [SPARK-5882][GraphX] Add a test for GraphLoade...

2015-02-18 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4674#issuecomment-74880353 Fixed.

[GitHub] spark pull request: [GraphX] Add a test for GraphLoader.edgeListFile

2015-02-18 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/4674 [GraphX] Add a test for GraphLoader.edgeListFile You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark AddGraphLoaderSuite Alternatively

[GitHub] spark pull request: [SPARK 5280] RDF Loader added + documentation

2015-02-18 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4650#issuecomment-74879414 Please add tests for your RDF loader, and see my code as an example: https://github.com/maropu/spark/commit/cc5ac0b08ca39c3c339fdca905779bb3b037f8fa BTW, I

[GitHub] spark pull request: [SPARK-5450][GraphX] Add APIs to save a graph ...

2015-01-28 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/4244 [SPARK-5450][GraphX] Add APIs to save a graph as a SequenceFile and load it As the size of input data increases, building a Graph takes much processing time via GraphLoader.edgeListFile() or RDD

[GitHub] spark pull request: [SPARK-5351][GraphX] Do not use Partitioner.de...

2015-01-24 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4136#issuecomment-71305053 Thanks for your quick commits! I ran into trouble with this bug in my GraphX application :))

[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-01-24 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4138#issuecomment-71305134 ISTM this patch causes no error, so please re-test it.

[GitHub] spark pull request: [SPARK-4970] Do not read spark.executor.memory...

2015-01-10 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3805#issuecomment-69449925 ok, understood. thanks.

[GitHub] spark pull request: [SPARK-4970] Do not read spark.executor.memory...

2015-01-10 Thread maropu
Github user maropu closed the pull request at: https://github.com/apache/spark/pull/3805

[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...

2015-01-10 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-69450446 Ok.

[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...

2015-01-10 Thread maropu
Github user maropu closed the pull request at: https://github.com/apache/spark/pull/3709

[GitHub] spark pull request: [SPARK-4970] Do not read spark.executor.memory...

2015-01-09 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3805#issuecomment-69310417 I looked over this issue; however, we couldn't find the best way to fix it, and I'm not sure that the change in this patch is the best. This patch just skips

[GitHub] spark pull request: [SPARK-6379][SQL] Support an implicit conversi...

2015-03-18 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5061#issuecomment-82888774 @rxin Fixed; please check it.

[GitHub] spark pull request: [SPARK-6357][GraphX] Add unapply in EdgeContex...

2015-03-16 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/5047 [SPARK-6357][GraphX] Add unapply in EdgeContext This extractor is mainly used for Graph#aggregateMessages*. You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-6281][GraphX] Support incremental updat...

2015-03-17 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/5067 [SPARK-6281][GraphX] Support incremental updates for an existing graph This patch is an initial one; that is, it has some inefficient code. I'll make the code more efficient step by step

[GitHub] spark pull request: [SPARK-6379][SQL] Support an implicit conversi...

2015-03-16 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/5061 [SPARK-6379][SQL] Support an implicit conversion from udfname to an UDF defined in SQLContext This is useful for using pre-defined UDFs in SQLContext; val df = Seq((id1, 1), (id2, 4), (id3

[GitHub] spark pull request: [SPARK-6379][SQL] Support an implicit conversi...

2015-03-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5061#issuecomment-82214695 You mean df.select($id, callUDF($value))?

[GitHub] spark pull request: [SPARK-6379][SQL] Support an implicit conversi...

2015-03-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5061#issuecomment-82673585 Ok, I'll try to refine my code.

[GitHub] spark pull request: [SPARK-6424][SQL] Add an user-defined aggregat...

2015-03-19 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5100#issuecomment-83922236 Thanks for your comments :)) I'm interested in your work, so I'll review them later. Also, I'll fix the description.

[GitHub] spark pull request: [SPARK-6424][SQL] Add an user-defined aggregat...

2015-03-19 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/5100 [SPARK-6424][SQL] Add an user-defined aggregator in AggregateExpression Currently, only user-defined generators (UDTF in Hive) are supported in native SparkSQL. This enables third-parties

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-03-20 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3247#issuecomment-83988487 Can you rebase the patch? I can't merge it into master.

[GitHub] spark pull request: [SPARK-4086][GraphX]: Fold-style aggregation f...

2015-03-24 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5142#issuecomment-85787944 lgtm except for the naming issue.

[GitHub] spark pull request: [SPARK-6379][SQL] Support a functon to call us...

2015-03-22 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5061#issuecomment-84779025 The description has been updated and the patch fixed.

[GitHub] spark pull request: Add a function that can build an EdgePartition...

2015-03-02 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/792#issuecomment-76870286 @ankurdave If this patch can be merged, I'll refactor it.

[GitHub] spark pull request: [SPARK-1955][GraphX]: VertexRDD can incorrectl...

2015-02-25 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4705#issuecomment-75930751 @brennonyork I'll add unit tests for your patch before or after the patch is merged.

[GitHub] spark pull request: [SPARK-6510][GraphX]: Add Graph#minus method t...

2015-03-26 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5175#issuecomment-86779685 lgtm

[GitHub] spark pull request: [SPARK-4086][GraphX]: Fold-style aggregation f...

2015-03-24 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5142#discussion_r27005643 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala --- @@ -154,7 +154,30 @@ abstract class VertexRDD[VD]( * @return a VertexRDD

[GitHub] spark pull request: [SPARK-4086][GraphX]: Fold-style aggregation f...

2015-03-24 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5142#discussion_r27005945 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/impl/VertexPartitionBaseOps.scala --- @@ -136,6 +136,31 @@ private[graphx] abstract class

[GitHub] spark pull request: [SPARK-4086][GraphX]: Fold-style aggregation f...

2015-03-24 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5142#discussion_r27005726 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala --- @@ -154,7 +154,30 @@ abstract class VertexRDD[VD]( * @return a VertexRDD

[GitHub] spark pull request: [SPARK-6379][SQL] Support a functon to call us...

2015-03-26 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5061#issuecomment-86361279 I'll fix it soon.

[GitHub] spark pull request: [SPARK-6379][SQL] Support a functon to call us...

2015-03-26 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5061#discussion_r27191782 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -212,6 +212,22 @@ class SQLContext(@transient val sparkContext: SparkContext

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-01 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3247#issuecomment-88711084 @chenghao-intel I also agree with your refactoring idea, though it's too big to merge into master in bulk. ISTM it would be better to split this patch into some small ones

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-02 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3247#issuecomment-88771801 Is it not possible to create a simple patch that removes DISTINCT aggregation expressions? We only add `distinct` as a field value in `AggregateExpresion

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27716819 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala --- @@ -305,6 +305,49 @@ trait Row extends Serializable { */ def getAs[T](i

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27717364 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala --- @@ -17,285 +17,159 @@ package

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27717124 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala --- @@ -17,285 +17,159 @@ package

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27716889 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/rows.scala --- @@ -37,6 +35,21 @@ trait MutableRow extends Row { def

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27717014 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala --- @@ -26,9 +26,12 @@ trait FunctionRegistry

[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5178#discussion_r27719488 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala --- @@ -223,6 +226,15 @@ class BlockManagerMasterActor(val isLocal

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27717257 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala --- @@ -17,285 +17,159 @@ package

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27717430 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Aggregate.scala --- @@ -17,181 +17,461 @@ package org.apache.spark.sql.execution

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27717539 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala --- @@ -26,9 +26,12 @@ trait FunctionRegistry

[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-03 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5178#issuecomment-89318869 One question: are there many cases where executors share a single host in YARN mode?

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-04-11 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5395#issuecomment-91777942 ISTM Hive supports list as a return type (see the links below). Also, some third-party libraries use it. https://github.com/kyluka/hive/blob/master/ql/src

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-04-26 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5395#issuecomment-96462276 cc @marmbrus Could you merge this into master? I'll make a PR for SPARK-6912, but it depends on this.

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-05-06 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5395#issuecomment-99679825 cc @marmbrus just a reminder

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-05-12 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5395#issuecomment-101162107 Oh, sorry. I'll fix it.

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-05-15 Thread maropu
Github user maropu closed the pull request at: https://github.com/apache/spark/pull/5395

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-05-15 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5395#issuecomment-102276959 @marmbrus I closed this PR by mistake. May I open a new PR, since I can't re-open this one?

[GitHub] spark pull request: [SPARK-6747] [SQL] Support List as a return ...

2015-05-15 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/6179#issuecomment-102298759 This PR re-opens #5395, which I closed by mistake.

[GitHub] spark pull request: [SPARK-6747] [SQL] Support List as a return ...

2015-05-15 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/6179 [SPARK-6747] [SQL] Support List as a return type in Hive UDF This patch supports List as a return type in Hive UDF. We assume a UDF like the one below: public class UDFToListString extends UDF

[GitHub] spark pull request: [SPARK-6747] [SQL] Support List as a return ...

2015-05-15 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/6179#discussion_r30457828 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -214,8 +217,16 @@ private[hive] trait HiveInspectors

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-04-14 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5395#issuecomment-92981451 Sorry for the delay. Fixed and plz re-check them.

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-04-14 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5395#discussion_r28346669 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -213,8 +215,16 @@ private[hive] trait HiveInspectors

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-04-14 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5395#discussion_r28346473 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -213,8 +215,16 @@ private[hive] trait HiveInspectors

[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-17 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/5549 [SPARK-5352][GraphX] Add getPartitionStrategy in Graph Graph remembers an applied partition strategy in partitionBy() and returns it via getPartitionStrategy(). This is useful in case

[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5178#issuecomment-93917004 @viper-kun What's the status of this patch? If you don't make further updates, I'd like to brush up this patch.

[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4138#issuecomment-93893277 Sorry, I closed this PR by mistake, so I re-made it: https://github.com/apache/spark/pull/5549

[GitHub] spark pull request: [SPARK-5623][GraphX] Replace an obsolete mapRe...

2015-04-17 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4402#issuecomment-93892969 ok, fixed.

[GitHub] spark pull request: [SPARK-5352][GraphX] Add getPartitionStrategy ...

2015-04-16 Thread maropu
Github user maropu closed the pull request at: https://github.com/apache/spark/pull/4138

[GitHub] spark pull request: [SPARK-6747][SQL] Support List as a return t...

2015-04-16 Thread maropu
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/5395#issuecomment-93870531 I missed that; fixed now. Does this fix address your point?

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-05 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27782004 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala --- @@ -17,285 +17,159 @@ package

[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-04-05 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/3247#discussion_r27781998 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala --- @@ -17,285 +17,159 @@ package

[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-05 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/5178#discussion_r27781900 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -181,68 +181,102 @@ final class ShuffleBlockFetcherIterator
