--jars does not take remote jar?

2017-05-02 Thread Nan Zhu
Hi all, for some reason I tried to pass an HDFS path to the --jars option of spark-submit. According to the documentation, http://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management, --jars accepts remote paths. However, in the implementation,
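For reference, the kind of invocation under discussion can be sketched as follows. This is a minimal illustration and not taken from the thread: the namenode address, jar paths, and class name are hypothetical, and whether spark-submit actually fetches the remote jars is exactly the question being raised. The only assumption about spark-submit itself is that --jars takes a single comma-separated list.

```python
import shlex


def build_submit_command(app_jar, main_class, extra_jars, master="yarn"):
    """Assemble a spark-submit argv list; --jars receives one comma-separated value."""
    return [
        "spark-submit",
        "--master", master,
        "--class", main_class,
        "--jars", ",".join(extra_jars),
        app_jar,
    ]


# Hypothetical paths and class name, used only to show the shape of the command.
cmd = build_submit_command(
    app_jar="app.jar",
    main_class="com.example.Main",
    extra_jars=[
        "hdfs://namenode:8020/libs/dep-a.jar",
        "hdfs://namenode:8020/libs/dep-b.jar",
    ],
)

# Print a shell-safe rendering of the command instead of running it.
print(shlex.join(cmd))
```

Building the argument list first and printing it keeps the sketch runnable without a Spark installation.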

Re: Azure Event Hub with Pyspark

2017-04-20 Thread Nan Zhu
DocDB does have a Java client, right? Is anything preventing you from using that? From: ayan guha Sent: Thursday, April 20, 2017 9:24:03 PM To: Ashish Singh Cc: user Subject: Re: Azure Event Hub with Pyspark Hi yes,

[jira] [Commented] (SPARK-20251) Spark streaming skips batches in a case of failure

2017-04-20 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977589#comment-15977589 ] Nan Zhu commented on SPARK-20251: - ignore my previous comments...the moving on Spark Streaming is due

[jira] [Comment Edited] (SPARK-20251) Spark streaming skips batches in a case of failure

2017-04-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962318#comment-15962318 ] Nan Zhu edited comment on SPARK-20251 at 4/10/17 12:16 AM: --- more details here

[jira] [Commented] (SPARK-20251) Spark streaming skips batches in a case of failure

2017-04-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962318#comment-15962318 ] Nan Zhu commented on SPARK-20251: - more details here, by "be proceeding", I mean it i

[jira] [Comment Edited] (SPARK-20251) Spark streaming skips batches in a case of failure

2017-04-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962313#comment-15962313 ] Nan Zhu edited comment on SPARK-20251 at 4/9/17 11:57 PM: -- why

[jira] [Commented] (SPARK-20251) Spark streaming skips batches in a case of failure

2017-04-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962313#comment-15962313 ] Nan Zhu commented on SPARK-20251: - why is this an invalid report? I have been observing the same behavior

Re: Outstanding Spark 2.1.1 issues

2017-03-20 Thread Nan Zhu
I think https://issues.apache.org/jira/browse/SPARK-19280 should be a blocker Best, Nan On Mon, Mar 20, 2017 at 8:18 PM, Felix Cheung wrote: > I've been scrubbing R and think we are tracking 2 issues > > https://issues.apache.org/jira/browse/SPARK-19237 > >

[jira] [Commented] (SPARK-19789) Add the shortcut of .format("parquet").option("path", "/hdfs/path").partitionBy("col1", "col2").start()

2017-03-12 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906782#comment-15906782 ] Nan Zhu commented on SPARK-19789: - [~zsxwing] mind reviewing the PR? > Add the shortcut of .for

Re: [Vote] New MXNet Logo

2017-03-04 Thread Nan Zhu
+1 From: Joseph Spisak Sent: Saturday, March 4, 2017 9:07:54 PM To: d...@mxnet.apache.org Subject: [Vote] New MXNet Logo Let's vote

[jira] [Created] (SPARK-19789) Add the shortcut of .format("parquet").option("path", "/hdfs/path").partitionBy("col1", "col2").start()

2017-03-01 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19789: --- Summary: Add the shortcut of .format("parquet").option("path", "/hdfs/path").partitionBy("col1", "col2").start() Key: SPARK-19789 URL:
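To make the idea of such a shortcut concrete, here is a small sketch of a fluent builder with a convenience method that expands to the longer chain. It is a toy stand-in written for illustration, not Spark's DataStreamWriter: the class ToyStreamWriter and the parquet() method name are hypothetical, and the JIRA snippet above does not say what the proposed shortcut would be called.

```python
class ToyStreamWriter:
    """Toy fluent writer used only to illustrate a builder shortcut."""

    def __init__(self):
        self._format = None
        self._options = {}
        self._partition_cols = []

    def format(self, source):
        self._format = source
        return self

    def option(self, key, value):
        self._options[key] = value
        return self

    def partitionBy(self, *cols):
        self._partition_cols = list(cols)
        return self

    def start(self):
        # A real writer would launch a streaming query here;
        # the toy simply returns the configuration it collected.
        return {
            "format": self._format,
            "options": dict(self._options),
            "partitionBy": list(self._partition_cols),
        }

    def parquet(self, path, *cols):
        # Hypothetical shortcut: one call that expands to the full chain.
        return self.format("parquet").option("path", path).partitionBy(*cols).start()


long_form = (
    ToyStreamWriter()
    .format("parquet")
    .option("path", "/hdfs/path")
    .partitionBy("col1", "col2")
    .start()
)
shortcut = ToyStreamWriter().parquet("/hdfs/path", "col1", "col2")

# Both spellings should produce the same configuration.
print(long_form == shortcut)
```

The shortcut adds no new behavior; it only saves the caller from repeating the format and path options.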

[jira] [Updated] (SPARK-19788) DataStreamReader/DataStreamWriter.option shall accept user-defined type

2017-03-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19788: Description: There are many other data sources/sinks which are configured very differently than

[jira] [Updated] (SPARK-19788) DataStreamReader/DataStreamWriter.option shall accept user-defined type

2017-03-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19788: Description: There are many other data sources/sinks which are configured very differently than

[jira] [Comment Edited] (SPARK-19788) DataStreamReader/DataStreamWriter.option shall accept user-defined type

2017-03-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890522#comment-15890522 ] Nan Zhu edited comment on SPARK-19788 at 3/1/17 4:45 PM: - another drawback

[jira] [Updated] (SPARK-19788) DataStreamReader/DataStreamWriter.option shall accept user-defined type

2017-03-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19788: Summary: DataStreamReader/DataStreamWriter.option shall accept user-defined type

[jira] [Commented] (SPARK-19788) DataStreamReader.option shall accept user-defined type

2017-03-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890522#comment-15890522 ] Nan Zhu commented on SPARK-19788: - another drawback is that it might look like incompatible

[jira] [Created] (SPARK-19788) DataStreamReader.option shall accept user-defined type

2017-03-01 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19788: --- Summary: DataStreamReader.option shall accept user-defined type Key: SPARK-19788 URL: https://issues.apache.org/jira/browse/SPARK-19788 Project: Spark Issue Type

[jira] [Commented] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-02-27 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886999#comment-15886999 ] Nan Zhu commented on SPARK-19280: - [~zsxwing] please let me know if we agree on that 2 is something we

Re: New Logo?

2017-02-21 Thread Nan Zhu
I assume that this vote is to decide *whether* we need a new logo? not specifically to one of the designs in the original vote under DMLC repo? On Tue, Feb 21, 2017 at 8:26 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote: > +1 > > On Mon, Feb 20, 2017 at 11:57 PM, Henri Yandell <

[jira] [Updated] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19499: Description: addBatch method in Sink trait is supposed to be a synchronous method to coordinate

[jira] [Updated] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19499: Summary: Add more notes in the comments of Sink.addBatch() (was: Add more description in the comments

[jira] [Created] (SPARK-19499) Add more description in the comments of Sink.addBatch()

2017-02-07 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19499: --- Summary: Add more description in the comments of Sink.addBatch() Key: SPARK-19499 URL: https://issues.apache.org/jira/browse/SPARK-19499 Project: Spark Issue Type

[jira] [Commented] (SPARK-19233) Inconsistent Behaviour of Spark Streaming Checkpoint

2017-02-03 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851787#comment-15851787 ] Nan Zhu commented on SPARK-19233: - ping > Inconsistent Behaviour of Spark Streaming Checkpo

[jira] [Commented] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-02-03 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851786#comment-15851786 ] Nan Zhu commented on SPARK-19280: - ping > Failed Recovery from checkpoint caused by the multi-thre

[jira] (SPARK-19233) Inconsistent Behaviour of Spark Streaming Checkpoint

2017-01-29 Thread Nan Zhu (JIRA)
Title: Message Title Nan Zhu commented on SPARK-19233

[jira] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-29 Thread Nan Zhu (JIRA)
Title: Message Title Nan Zhu commented on SPARK-19280

[jira] [Created] (SPARK-19358) LiveListenerBus shall log the event name when dropping them due to a fully filled queue

2017-01-24 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19358: --- Summary: LiveListenerBus shall log the event name when dropping them due to a fully filled queue Key: SPARK-19358 URL: https://issues.apache.org/jira/browse/SPARK-19358
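The behaviour requested in this issue can be sketched with a bounded queue that records and logs the name of each event it has to drop. This is a minimal standalone illustration and not Spark's LiveListenerBus code: BoundedEventBus, its capacity argument, and the two event classes are hypothetical names chosen for the example.

```python
import logging
import queue

logger = logging.getLogger("listener_bus_sketch")


class BoundedEventBus:
    """Toy event bus that drops events when its bounded queue is full."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = []  # names of dropped events, kept for inspection

    def post(self, event):
        """Enqueue the event; if the queue is full, log its name and drop it."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            name = type(event).__name__
            self.dropped.append(name)
            logger.warning("Dropping event %s because the queue is full", name)
            return False


# Hypothetical event types, standing in for real listener events.
class JobStart:
    pass


class TaskEnd:
    pass


bus = BoundedEventBus(capacity=1)
bus.post(JobStart())  # fits into the queue
bus.post(TaskEnd())   # queue is full, so this one is dropped and its name logged
```

Including the event name in the log line makes it possible to tell afterwards which kind of event was lost, instead of only knowing that something was dropped.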

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Nan Zhu
Congratulations! On Tue, Jan 24, 2017 at 4:50 PM, Hyukjin Kwon wrote: > Congratulations!! > > 2017-01-25 9:22 GMT+09:00 Takeshi Yamamuro : > >> Congrats! >> >> // maropu >> >> On Wed, Jan 25, 2017 at 9:20 AM, Kousuke Saruta < >>

[jira] [Comment Edited] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-20 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831209#comment-15831209 ] Nan Zhu edited comment on SPARK-19280 at 1/20/17 1:24 PM: -- [~zsxwing] Thanks

[jira] [Commented] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-19 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831217#comment-15831217 ] Nan Zhu commented on SPARK-19280: - BTW, do I need to highlight the KafkaDStream issue as another JIRA

[jira] [Commented] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-19 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831209#comment-15831209 ] Nan Zhu commented on SPARK-19280: - [~zsxwing] Thanks for reply 0) I do not think the content

[jira] [Updated] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-19 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19280: Description: In one of our applications, we found the following issue, the application recovering from

[jira] [Commented] (SPARK-19233) Inconsistent Behaviour of Spark Streaming Checkpoint

2017-01-19 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831098#comment-15831098 ] Nan Zhu commented on SPARK-19233: - By filtering generatedRDDs, I may bring some confusion here, what I

[jira] [Updated] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-18 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19280: Description: In one of our applications, we found the following issue, the application recovering from

[jira] [Commented] (SPARK-19278) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-18 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828621#comment-15828621 ] Nan Zhu commented on SPARK-19278: - could anyone help close this one? It is a duplicate of 19280

[jira] [Commented] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-18 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828602#comment-15828602 ] Nan Zhu commented on SPARK-19280: - [~zsxwing] would you mind confirming about this? it would be great

[jira] [Created] (SPARK-19280) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-18 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19280: --- Summary: Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler Key: SPARK-19280 URL: https://issues.apache.org/jira/browse/SPARK-19280

[jira] [Created] (SPARK-19278) Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler

2017-01-18 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19278: --- Summary: Failed Recovery from checkpoint caused by the multi-threads issue in Spark Streaming scheduler Key: SPARK-19278 URL: https://issues.apache.org/jira/browse/SPARK-19278

[jira] [Commented] (SPARK-19233) Inconsistent Behaviour of Spark Streaming Checkpoint

2017-01-15 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823364#comment-15823364 ] Nan Zhu commented on SPARK-19233: - [~zsxwing] so, another potential issue I found in Spark Streaming

[jira] [Commented] (SPARK-19233) Inconsistent Behaviour of Spark Streaming Checkpoint

2017-01-15 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823359#comment-15823359 ] Nan Zhu commented on SPARK-19233: - The category of this issue is Improvement which is subject

[jira] [Created] (SPARK-19233) Inconsistent Behaviour of Spark Streaming Checkpoint

2017-01-15 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19233: --- Summary: Inconsistent Behaviour of Spark Streaming Checkpoint Key: SPARK-19233 URL: https://issues.apache.org/jira/browse/SPARK-19233 Project: Spark Issue Type

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813632#comment-15813632 ] Nan Zhu commented on SPARK-18905: - [~zsxwing] If you agree on the conclusion above, I will file a PR

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813560#comment-15813560 ] Nan Zhu commented on SPARK-18905: - eat my words... when we have queued up batches, we do need

[jira] [Comment Edited] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813459#comment-15813459 ] Nan Zhu edited comment on SPARK-18905 at 1/10/17 1:16 AM: -- yeah

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813459#comment-15813459 ] Nan Zhu commented on SPARK-18905: - yeah, but the downTime including all batches from "checkpoint

[jira] [Comment Edited] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813434#comment-15813434 ] Nan Zhu edited comment on SPARK-18905 at 1/10/17 1:05 AM: -- Hi, [~zsxwing

[jira] [Commented] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2017-01-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813434#comment-15813434 ] Nan Zhu commented on SPARK-18905: - Hi, [~zsxwing] Thanks for the reply, After testing in our

[jira] [Updated] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2016-12-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-18905: Description: the current implementation of Spark streaming considers a batch completed no matter

[jira] [Updated] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2016-12-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-18905: Description: the current implementation of Spark streaming considers a batch completed no matter

[jira] [Updated] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2016-12-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-18905: Description: the current implementation of Spark streaming considers a batch completed no matter

[jira] [Created] (SPARK-18905) Potential Issue of Semantics of BatchCompleted

2016-12-16 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-18905: --- Summary: Potential Issue of Semantics of BatchCompleted Key: SPARK-18905 URL: https://issues.apache.org/jira/browse/SPARK-18905 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-17347) Encoder in Dataset example has incorrect type

2016-08-31 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-17347: Summary: Encoder in Dataset example has incorrect type (was: Encoder in Dataset example is incorrect

[jira] [Created] (SPARK-17347) Encoder in Dataset example is incorrect on type

2016-08-31 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-17347: --- Summary: Encoder in Dataset example is incorrect on type Key: SPARK-17347 URL: https://issues.apache.org/jira/browse/SPARK-17347 Project: Spark Issue Type: Bug

Re: Welcoming Yanbo Liang as a committer

2016-06-03 Thread Nan Zhu
Congratulations ! --  Nan Zhu On June 3, 2016 at 10:50:33 PM, Ted Yu (yuzhih...@gmail.com) wrote: Congratulations, Yanbo. On Fri, Jun 3, 2016 at 7:48 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote: Hi all, The PMC recently voted to add Yanbo Liang as a committer. Yanbo has been a

[jira] [Closed] (SPARK-14247) Spark does not compile with CDH-5.4.x due to the possible bug of ivy.....

2016-03-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu closed SPARK-14247. --- Resolution: Not A Problem > Spark does not compile with CDH-5.4.x due to the possible bug of

[jira] [Comment Edited] (SPARK-14247) Spark does not compile with CDH-5.4.x due to the possible bug of ivy.....

2016-03-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216719#comment-15216719 ] Nan Zhu edited comment on SPARK-14247 at 3/29/16 7:39 PM: -- thanks [~sowen

[jira] [Commented] (SPARK-14247) Spark does not compile with CDH-5.4.x due to the possible bug of ivy.....

2016-03-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216719#comment-15216719 ] Nan Zhu commented on SPARK-14247: - thanks [~sowen], it seems that changing the hadoop.version name solves

[jira] [Updated] (SPARK-14247) Spark does not compile with CDH-5.4.x due to the possible bug of ivy.....

2016-03-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-14247: Priority: Minor (was: Major) > Spark does not compile with CDH-5.4.x due to the possible bug of

[jira] [Commented] (SPARK-14247) Spark does not compile with CDH-5.4.x due to the possible bug of ivy.....

2016-03-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216661#comment-15216661 ] Nan Zhu commented on SPARK-14247: - [~srowen] I always blindly copied "CDH.*" string from Spar

[jira] [Created] (SPARK-14247) Spark does not compile with CDH-5.4.x due to the possible bug of ivy.....

2016-03-29 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-14247: --- Summary: Spark does not compile with CDH-5.4.x due to the possible bug of ivy. Key: SPARK-14247 URL: https://issues.apache.org/jira/browse/SPARK-14247 Project: Spark

[Package Release] Widely accepted XGBoost now available in Spark

2016-03-16 Thread Nan Zhu
are more than welcome to join us and contribute to the project! For more details of distributed XGBoost, you can refer to the recently published paper: http://arxiv.org/abs/1603.02754 Best, -- Nan Zhu http://codingcat.me

[jira] [Commented] (SPARK-8547) xgboost exploration

2016-03-15 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195682#comment-15195682 ] Nan Zhu commented on SPARK-8547: FYI, we released a solution to integrate XGBoost with Spark directly

[jira] [Commented] (SPARK-13868) Random forest accuracy exploration

2016-03-15 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195686#comment-15195686 ] Nan Zhu commented on SPARK-13868: - FYI, we released a solution to integrate XGBoost with Spark directly

Release Announcement: XGBoost4J - Portable Distributed XGBoost in Spark, Flink and Dataflow

2016-03-15 Thread Nan Zhu
! For more details of distributed XGBoost, you can refer to the recently published paper: http://arxiv.org/abs/1603.02754 Best, -- Nan Zhu http://codingcat.me

Re: Failing MiMa tests

2016-03-14 Thread Nan Zhu
I guess it’s Jenkins’ problem? My PR failed MiMa but still got a message from SparkQA (https://github.com/SparkQA) saying that "This patch passes all tests." I checked Jenkins’ history, and there are other PRs with the same issue…. Best, -- Nan Zhu http://codingcat.me

[jira] [Created] (SPARK-13227) Risky apply() in OpenHashMap

2016-02-06 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-13227: --- Summary: Risky apply() in OpenHashMap Key: SPARK-13227 URL: https://issues.apache.org/jira/browse/SPARK-13227 Project: Spark Issue Type: Bug Components

[jira] [Commented] (SPARK-12786) Actor demo does not demonstrate usable code

2016-01-13 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096187#comment-15096187 ] Nan Zhu commented on SPARK-12786: - the only place it relies on AkkaUtil is to create an ActorSystem

[jira] [Commented] (SPARK-12713) UI Executor page should keep links around to executors that died

2016-01-08 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089884#comment-15089884 ] Nan Zhu commented on SPARK-12713: - I attached a PR and two duplicate JIRAs which are addressing the same

[jira] [Commented] (SPARK-12469) Consistent Accumulators for Spark

2015-12-25 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071799#comment-15071799 ] Nan Zhu commented on SPARK-12469: - Just to bring the previous discussions about the topic here, https

[jira] [Comment Edited] (SPARK-12469) Consistent Accumulators for Spark

2015-12-25 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071799#comment-15071799 ] Nan Zhu edited comment on SPARK-12469 at 12/26/15 2:44 AM: --- Just to bring

[jira] [Commented] (SPARK-12237) Unsupported message RpcMessage causes message retries

2015-12-10 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051499#comment-15051499 ] Nan Zhu commented on SPARK-12237: - if that's the case, I don't think it would happen in the real world

[jira] [Commented] (SPARK-12237) Unsupported message RpcMessage causes message retries

2015-12-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048651#comment-15048651 ] Nan Zhu commented on SPARK-12237: - may I ask how you found this issue? It seems that Master received

[jira] [Commented] (SPARK-12229) How to Perform spark submit of application written in scala from Node js

2015-12-09 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048663#comment-15048663 ] Nan Zhu commented on SPARK-12229: - https://github.com/spark-jobserver/spark-jobserver might be a good

[jira] [Created] (SPARK-12021) Fishy test of "don't call ssc.stop in listener"

2015-11-26 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-12021: --- Summary: Fishy test of "don't call ssc.stop in listener" Key: SPARK-12021 URL: https://issues.apache.org/jira/browse/SPARK-12021 Project: Spark Issue

tests blocked at "don't call ssc.stop in listener"

2015-11-26 Thread Nan Zhu
https://issues.apache.org/jira/browse/SPARK-12021 Best, -- Nan Zhu http://codingcat.me

Re: A proposal for Spark 2.0

2015-11-12 Thread Nan Zhu
Being specific to Parameter Server, I think the current agreement is that PS shall exist as a third-party library instead of a component of the core code base, isn’t it? Best, -- Nan Zhu http://codingcat.me On Thursday, November 12, 2015 at 9:49 AM, wi...@qq.com wrote: > Who has the i

[jira] [Commented] (SPARK-11402) Allow to define a custom driver runner and executor runner

2015-10-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980584#comment-14980584 ] Nan Zhu commented on SPARK-11402: - I'm curious about what kind of functionalities you need

[jira] [Created] (SPARK-10315) remove document on spark.akka.failure-detector.threshold

2015-08-27 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-10315: --- Summary: remove document on spark.akka.failure-detector.threshold Key: SPARK-10315 URL: https://issues.apache.org/jira/browse/SPARK-10315 Project: Spark Issue Type

Re: Paper on Spark SQL

2015-08-17 Thread Nan Zhu
an extra “,” is at the end -- Nan Zhu http://codingcat.me On Monday, August 17, 2015 at 9:28 AM, Ted Yu wrote: I got 404 when trying to access the link. On Aug 17, 2015, at 5:31 AM, Todd bit1...@163.com wrote: Hi, I can't access http

[jira] [Created] (SPARK-9602) Remove 'Actor' from the comments

2015-08-04 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-9602: -- Summary: Remove 'Actor' from the comments Key: SPARK-9602 URL: https://issues.apache.org/jira/browse/SPARK-9602 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-9514) Add EventHubsReceiver to support Spark Streaming using Azure EventHubs

2015-08-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650308#comment-14650308 ] Nan Zhu commented on SPARK-9514: I think the best way to do it is to add a new component

[jira] [Commented] (SPARK-9514) Add EventHubsReceiver to support Spark Streaming using Azure EventHubs

2015-08-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650568#comment-14650568 ] Nan Zhu commented on SPARK-9514: Hi, [~shanyu], in Spark, we usually submit patches via

[jira] [Created] (SPARK-9516) Improve Thread Dump page

2015-07-31 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-9516: -- Summary: Improve Thread Dump page Key: SPARK-9516 URL: https://issues.apache.org/jira/browse/SPARK-9516 Project: Spark Issue Type: New Feature Components: Web

[jira] [Commented] (SPARK-9516) Improve Thread Dump page

2015-07-31 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650090#comment-14650090 ] Nan Zhu commented on SPARK-9516: I can work on it after finishing SPARK-8416 Improve

[jira] [Commented] (SPARK-9123) Spark HistoryServer load logs too slow and can load the latest logs

2015-07-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630699#comment-14630699 ] Nan Zhu commented on SPARK-9123: do you mind closing the duplicate JIRAs? SPARK-9124 SPARK

[jira] [Comment Edited] (SPARK-9123) Spark HistoryServer load logs too slow and can load the latest logs

2015-07-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630699#comment-14630699 ] Nan Zhu edited comment on SPARK-9123 at 7/17/15 3:05 AM: - do you

[jira] [Updated] (SPARK-9123) Spark HistoryServer load logs too slow and can load the latest logs

2015-07-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-9123: --- Target Version/s: (was: 1.5.0) Spark HistoryServer load logs too slow and can load the latest logs

[jira] [Commented] (SPARK-9123) Spark HistoryServer load logs too slow and can load the latest logs

2015-07-16 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630706#comment-14630706 ] Nan Zhu commented on SPARK-9123: just removed the target version label, see here: https

Re: [SparkScore]Performance portal for Apache Spark - WW26

2015-06-26 Thread Nan Zhu
Thank you, Jie! Very nice work! -- Nan Zhu http://codingcat.me On Friday, June 26, 2015 at 8:17 AM, Huang, Jie wrote: Correct. Your calculation is right! We have been aware of that kmeans performance drop also. According to our observation, it is caused by some unbalanced

Re: [SparkScore]Performance portal for Apache Spark - WW26

2015-06-26 Thread Nan Zhu
, what happened to k-means in HiBench? Best, -- Nan Zhu http://codingcat.me On Friday, June 26, 2015 at 7:24 AM, Huang, Jie wrote: Intel® Xeon® CPU E5-2697

[jira] [Closed] (SPARK-1715) Ensure actor is self-contained in DAGScheduler

2015-05-15 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu closed SPARK-1715. -- Resolution: Won't Fix Akka actor has been removed from DAGScheduler Ensure actor is self-contained

Re: What happened to the Row class in 1.3.0?

2015-04-06 Thread Nan Zhu
The Row class was mistakenly left undocumented in 1.3.0; you can check the 1.3.1 API doc: http://people.apache.org/~pwendell/spark-1.3.1-rc1-docs/api/scala/index.html#org.apache.spark.sql.Row Best, -- Nan Zhu http://codingcat.me On Monday, April 6, 2015 at 10:23 AM, ARose wrote: I am trying

Re: What happened to the Row class in 1.3.0?

2015-04-06 Thread Nan Zhu
Hi, Ted It’s here: https://github.com/apache/spark/blob/61b427d4b1c4934bd70ed4da844b64f0e9a377aa/sql/catalyst/src/main/java/org/apache/spark/sql/RowFactory.java Best, -- Nan Zhu http://codingcat.me On Monday, April 6, 2015 at 10:44 AM, Ted Yu wrote: I searched code base but didn't

[jira] [Commented] (SPARK-6646) Spark 2.0: Rearchitecting Spark for Mobile Platforms

2015-04-01 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390367#comment-14390367 ] Nan Zhu commented on SPARK-6646: super cool, Spark enables Bigger than Bigger Data

Re: java.io.NotSerializableException: org.apache.hadoop.hbase.client.Result

2015-03-31 Thread Nan Zhu
The example in https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala might help Best, -- Nan Zhu http://codingcat.me On Tuesday, March 31, 2015 at 3:56 PM, Sean Owen wrote: Yep, it's not serializable: https://hbase.apache.org

[jira] [Commented] (SPARK-6592) API of Row trait should be presented in Scala doc

2015-03-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385724#comment-14385724 ] Nan Zhu commented on SPARK-6592: ? I don't think that makes any difference, as the path

[jira] [Commented] (SPARK-6592) API of Row trait should be presented in Scala doc

2015-03-29 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385964#comment-14385964 ] Nan Zhu commented on SPARK-6592: it contains the reason is that the input
