[jira] [Created] (SPARK-17678) Spark 1.6 Scala-2.11 repl doesn't honor "spark.replClassServer.port"

2016-09-26 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-17678: --- Summary: Spark 1.6 Scala-2.11 repl doesn't honor "spark.replClassServer.port" Key: SPARK-17678 URL: https://issues.apache.org/jira/browse/SPARK-17678 Proj

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-09-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515836#comment-15515836 ] Saisai Shao commented on SPARK-17637: - [~zhanzhang] would you mind sharing more details about your

[jira] [Comment Edited] (SPARK-17624) Flaky test? StateStoreSuite maintenance

2016-09-21 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15512237#comment-15512237 ] Saisai Shao edited comment on SPARK-17624 at 9/22/16 5:36 AM: -- I cannot

[jira] [Commented] (SPARK-17624) Flaky test? StateStoreSuite maintenance

2016-09-21 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15512237#comment-15512237 ] Saisai Shao commented on SPARK-17624: - I cannot reproduce locally on my > Flaky t

[jira] [Updated] (SPARK-17604) Support purging aged file entry for FileStreamSource metadata log

2016-09-20 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-17604: Issue Type: Sub-task (was: Improvement) Parent: SPARK-17267 > Support purging aged f

[jira] [Updated] (SPARK-17604) Support purging aged file entry for FileStreamSource metadata log

2016-09-20 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-17604: Description: Currently with SPARK-15698, FileStreamSource metadata log will be compacted

[jira] [Created] (SPARK-17604) Support purging aged file entry for FileStreamSource metadata log

2016-09-20 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-17604: --- Summary: Support purging aged file entry for FileStreamSource metadata log Key: SPARK-17604 URL: https://issues.apache.org/jira/browse/SPARK-17604 Project: Spark

[jira] [Commented] (SPARK-15698) Ability to remove old metadata for structure streaming MetadataLog

2016-09-20 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505799#comment-15505799 ] Saisai Shao commented on SPARK-15698: - I think [~rxin] set this target version before. I'm OK

[jira] [Commented] (SPARK-17566) "--master yarn --deploy-mode cluster" gives "Launching Python applications through spark-submit is currently only supported for local files"

2016-09-18 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500522#comment-15500522 ] Saisai Shao commented on SPARK-17566: - I've already submitted a PR under SPARK-17512, since this JIRA

[jira] [Commented] (SPARK-17566) "--master yarn --deploy-mode cluster" gives "Launching Python applications through spark-submit is currently only supported for local files"

2016-09-18 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500511#comment-15500511 ] Saisai Shao commented on SPARK-17566: - Shouldn't it be {{!isYarnCluster}}? Since we need to avoid

[jira] [Updated] (SPARK-17512) Specifying remote files for Python based Spark jobs in Yarn cluster mode not working

2016-09-18 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-17512: Component/s: YARN > Specifying remote files for Python based Spark jobs in Yarn cluster m

[jira] [Commented] (SPARK-17512) Specifying remote files for Python based Spark jobs in Yarn cluster mode not working

2016-09-17 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500178#comment-15500178 ] Saisai Shao commented on SPARK-17512: - This is due to some behavior changes during submitting spark

[jira] [Closed] (SPARK-17566) "--master yarn --deploy-mode cluster" gives "Launching Python applications through spark-submit is currently only supported for local files"

2016-09-17 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao closed SPARK-17566. --- Resolution: Duplicate > "--master yarn --deploy-mode cluster" gives "Launching P

[jira] [Commented] (SPARK-17566) "--master yarn --deploy-mode cluster" gives "Launching Python applications through spark-submit is currently only supported for local files"

2016-09-17 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500145#comment-15500145 ] Saisai Shao commented on SPARK-17566: - Sorry I misunderstood your point, looks like it should

[jira] [Commented] (SPARK-17566) "--master yarn --deploy-mode cluster" gives "Launching Python applications through spark-submit is currently only supported for local files"

2016-09-17 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500109#comment-15500109 ] Saisai Shao commented on SPARK-17566: - Can you confirm the above command you mentioned can be run

Re: Spark metrics when running with YARN?

2016-09-17 Thread Saisai Shao
dalone? > > Why are there 2 ways to get information, REST API and this Sink? > > > Best regards, Vladimir. > > > > > > > On Mon, Sep 12, 2016 at 3:53 PM, Vladimir Tretyakov < > vladimir.tretya...@sematext.com> wrote: > >> Hello Saisai Shao,

[jira] [Comment Edited] (SPARK-17522) [MESOS] More even distribution of executors on Mesos cluster

2016-09-15 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495243#comment-15495243 ] Saisai Shao edited comment on SPARK-17522 at 9/16/16 3:19 AM: -- [~sunrui] I

[jira] [Commented] (SPARK-17522) [MESOS] More even distribution of executors on Mesos cluster

2016-09-15 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495243#comment-15495243 ] Saisai Shao commented on SPARK-17522: - [~sunrui] I think the performance is depended on different

Re: Spark metrics when running with YARN?

2016-09-12 Thread Saisai Shao
Here is the YARN RM REST API for your reference (http://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html). You can use these APIs to query applications running on YARN. On Sun, Sep 11, 2016 at 11:25 PM, Jacek Laskowski wrote: > Hi Vladimir, > >
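
As a rough illustration of the reply above, the sketch below builds a query URL for the ResourceManager "Cluster Applications" endpoint (`/ws/v1/cluster/apps`, with its `states` and `applicationTypes` filters) and pulls a few fields out of a JSON response. The host name `rm.example.com:8088` and the helper names are assumptions made for this example, not something taken from the thread.

```python
import json
from urllib.parse import urlencode


def cluster_apps_url(rm_address, states=None, application_types=None):
    """Build the URL of the RM 'Cluster Applications' endpoint.

    rm_address is "host:port" of the ResourceManager web UI (hypothetical here).
    """
    params = {}
    if states:
        params["states"] = ",".join(states)
    if application_types:
        params["applicationTypes"] = ",".join(application_types)
    url = f"http://{rm_address}/ws/v1/cluster/apps"
    return f"{url}?{urlencode(params)}" if params else url


def app_summaries(response_text):
    """Return (id, name, state) tuples from the endpoint's JSON body."""
    body = json.loads(response_text)
    # "apps" can be null when no application matches the filters.
    apps = (body.get("apps") or {}).get("app", [])
    return [(a["id"], a["name"], a["state"]) for a in apps]
```

The URL can be fetched with any HTTP client (for example `urllib.request.urlopen`) using an `Accept: application/json` header, and the body passed to `app_summaries`.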

[jira] [Commented] (SPARK-17340) .sparkStaging not cleaned if application exited incorrectly

2016-09-07 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470553#comment-15470553 ] Saisai Shao commented on SPARK-17340: - I think what [~asukhenko] mentioned in the description is one

[jira] [Commented] (SPARK-17340) .sparkStaging not cleaned if application exited incorrectly

2016-09-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15455080#comment-15455080 ] Saisai Shao commented on SPARK-17340: - yarn-client and yarn-cluster has different way to handle

[jira] [Comment Edited] (SPARK-17340) .sparkStaging not cleaned if application exited incorrectly

2016-09-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15454777#comment-15454777 ] Saisai Shao edited comment on SPARK-17340 at 9/1/16 11:02 AM: -- I think

[jira] [Commented] (SPARK-17340) .sparkStaging not cleaned if application exited incorrectly

2016-09-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15455076#comment-15455076 ] Saisai Shao commented on SPARK-17340: - You can try not kill local {{yarn#client}} process after

[jira] [Commented] (SPARK-17340) .sparkStaging not cleaned if application exited incorrectly

2016-09-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15455055#comment-15455055 ] Saisai Shao commented on SPARK-17340: - I'm saying yarn cluster mode, I think here in my comment

[jira] [Commented] (SPARK-17340) .sparkStaging not cleaned if application exited incorrectly

2016-09-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15454777#comment-15454777 ] Saisai Shao commented on SPARK-17340: - I think in your scenario, it is because you killed local

Re: Spark 2.0 and Yarn

2016-08-29 Thread Saisai Shao
This archive contains all the jars required by Spark runtime, you could zip all the jars under /jars and upload this archive to HDFS, then configure spark.yarn.archive with the path of this archive on HDFS. On Sun, Aug 28, 2016 at 9:59 PM, Srikanth Sampath wrote: >
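
The packaging step described above could be sketched as follows. This is only an illustration: the function name and all paths are hypothetical, and the jars are placed at the root of the archive on the assumption that this is the layout Spark expects for `spark.yarn.archive`.

```python
import zipfile
from pathlib import Path


def build_spark_archive(jars_dir, archive_path):
    """Zip every *.jar found directly under jars_dir into archive_path.

    Each jar is stored at the root of the archive (no directory prefix).
    Returns the list of jar file names that were added.
    """
    jars = sorted(Path(jars_dir).glob("*.jar"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for jar in jars:
            zf.write(jar, arcname=jar.name)
    return [jar.name for jar in jars]
```

After uploading the resulting file to HDFS (for instance with `hdfs dfs -put`), `spark.yarn.archive` would be set to its HDFS path, e.g. `hdfs:///some/dir/spark-jars.zip` (a made-up location).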

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436222#comment-15436222 ] Saisai Shao commented on SPARK-17204: - Yes, I could reproduce this issue, but not constantly

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436213#comment-15436213 ] Saisai Shao commented on SPARK-17204: - I think to reflect the issue {{sc.range(0, 0)}} should

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436205#comment-15436205 ] Saisai Shao commented on SPARK-17204: - No, I tested in yarn cluster, not local mode. > Spark 2.0

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436179#comment-15436179 ] Saisai Shao commented on SPARK-17204: - It works OK in my local test with latest build: {code} val

Re: Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Saisai Shao
oud.com > > > *From:* Sun Rui <sunrise_...@163.com> > *Date:* 2016-08-24 22:17 > *To:* Saisai Shao <sai.sai.s...@gmail.com> > *CC:* tony@tendcloud.com; user <user@spark.apache.org> > *Subject:* Re: Can we redirect Spark shuffle spill data to HDFS or >

Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Saisai Shao
ty, and also there is additional overhead of network I/O and replica > of HDFS files. > > On Aug 24, 2016, at 21:02, Saisai Shao <sai.sai.s...@gmail.com> wrote: > > Spark Shuffle uses Java File related API to create local dirs and R/W > data, so it can only be worked with OS suppor

Re: Can we redirect Spark shuffle spill data to HDFS or Alluxio?

2016-08-24 Thread Saisai Shao
Spark Shuffle uses the Java File API to create local dirs and read/write data, so it only works with an OS-supported FS. It doesn't leverage the Hadoop FileSystem API, so writing to a Hadoop-compatible FS does not work. Also it is not suitable to write temporary shuffle data into a distributed FS, this

Re: dynamic allocation in Spark 2.0

2016-08-24 Thread Saisai Shao
This looks like the Spark application is running into an abnormal state. From the stack it means the driver could not send requests to the AM; can you please check whether the AM is reachable and whether there are any other exceptions besides this one? From my past test, Spark's dynamic allocation may run into some corner

[jira] [Updated] (SPARK-17209) Support manual credential updating in the run-time for Spark on YARN

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-17209: Summary: Support manual credential updating in the run-time for Spark on YARN (was: Support

[jira] [Updated] (SPARK-17209) Support manual credential updating in the run-time for Spark on YARN

2016-08-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-17209: Description: Current Spark on YARN supports time based credential renewal and updating

[jira] [Created] (SPARK-17209) Support manual credential updating in the run-time

2016-08-24 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-17209: --- Summary: Support manual credential updating in the run-time Key: SPARK-17209 URL: https://issues.apache.org/jira/browse/SPARK-17209 Project: Spark Issue Type

[jira] [Commented] (SPARK-17148) NodeManager exit because of exception “Executor is not registered”

2016-08-22 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430402#comment-15430402 ] Saisai Shao commented on SPARK-17148: - I manually verified this by explicitly throwing

Re: Apache Spark toDebugString producing different output for python and scala repl

2016-08-15 Thread Saisai Shao
The implementation inside the Python API and Scala API for RDD is slightly different, so the difference of RDD lineage you printed is expected. On Tue, Aug 16, 2016 at 10:58 AM, DEEPAK SHARMA wrote: > Hi All, > > > Below is the small piece of code in scala and

[jira] [Created] (SPARK-17019) Expose off-heap memory usage in various places

2016-08-11 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-17019: --- Summary: Expose off-heap memory usage in various places Key: SPARK-17019 URL: https://issues.apache.org/jira/browse/SPARK-17019 Project: Spark Issue Type

[jira] [Commented] (AMBARI-18091) Use https url for Spark2 Service check when WireEncryption is enabled

2016-08-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/AMBARI-18091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414751#comment-15414751 ] Saisai Shao commented on AMBARI-18091: -- Please help to review, [~sumitmohanty] [~jluniya], thanks

[jira] [Assigned] (AMBARI-18091) Use https url for Spark2 Service check when WireEncryption is enabled

2016-08-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/AMBARI-18091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao reassigned AMBARI-18091: Assignee: Saisai Shao > Use https url for Spark2 Service check when WireEncrypt

Review Request 50945: Fix Spark2 service check failure when WE is enabled

2016-08-09 Thread Saisai Shao
/resources/common-services/SPARK2/2.0.0/package/scripts/service_check.py 565f924 Diff: https://reviews.apache.org/r/50945/diff/ Testing --- Manual verification is done. Thanks, Saisai Shao

[jira] [Commented] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists

2016-08-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414481#comment-15414481 ] Saisai Shao commented on SPARK-16966: - Here is the code in {{SparkSubmitArguments}} to handle

[jira] [Commented] (SPARK-16966) App Name is a randomUUID even when "spark.app.name" exists

2016-08-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413201#comment-15413201 ] Saisai Shao commented on SPARK-16966: - Yes, agreed. A better way is to handle this app name thing

[jira] [Commented] (SPARK-16944) [MESOS] Improve data locality when launching new executors when dynamic allocation is enabled

2016-08-07 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411299#comment-15411299 ] Saisai Shao commented on SPARK-16944: - Does Mesos have the similar concept like Yarn container, also

Re: submitting spark job with kerberized Hadoop issue

2016-08-07 Thread Saisai Shao
1. Standalone mode doesn't support accessing kerberized Hadoop, simply because it lacks the mechanism to distribute delegation tokens via cluster manager. 2. For the HBase token fetching failure, I think you have to do kinit to generate tgt before start spark application (

[jira] [Comment Edited] (SPARK-16914) NodeManager crash when spark are registering executor infomartion into leveldb

2016-08-07 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411159#comment-15411159 ] Saisai Shao edited comment on SPARK-16914 at 8/8/16 1:48 AM: - So from your

[jira] [Commented] (SPARK-16914) NodeManager crash when spark are registering executor infomartion into leveldb

2016-08-07 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411159#comment-15411159 ] Saisai Shao commented on SPARK-16914: - So from your description, is this exception mainly due

Re: spark 2.0.0 - how to build an uber-jar?

2016-08-03 Thread Saisai Shao
I guess you mean the Spark assembly uber jar. In Spark 2.0 there is no uber jar; instead there is a jars folder which contains all the jars required at run time. This is transparent to the end user: the way to submit a Spark application is still the same. On Wed, Aug 3, 2016 at 4:51 PM,

[jira] [Updated] (SPARK-16871) Support getting HBase tokens from multiple clusters dynamically

2016-08-03 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-16871: Summary: Support getting HBase tokens from multiple clusters dynamically (was: Support getting

[jira] [Created] (SPARK-16871) Support getting HBase tokens from multiple clusters and dynamically

2016-08-03 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-16871: --- Summary: Support getting HBase tokens from multiple clusters and dynamically Key: SPARK-16871 URL: https://issues.apache.org/jira/browse/SPARK-16871 Project: Spark

Re: Spark on yarn, only 1 or 2 vcores getting allocated to the containers getting created.

2016-08-03 Thread Saisai Shao
Using the dominant resource calculator instead of the default resource calculator will give you the expected vcores. Basically, by default YARN does not honor CPU cores as a resource, so you will always see a vcore count of 1 no matter how many cores you set in Spark. On Wed, Aug 3, 2016 at 12:11 PM,
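
A minimal sketch of the setting this reply refers to, assuming the cluster runs the CapacityScheduler (the snippet does not say which scheduler is in use). The helper below only renders a Hadoop-style `<property>` element; the property name and class are the CapacityScheduler's `yarn.scheduler.capacity.resource-calculator` and `DominantResourceCalculator`, and the helper name is made up for the example.

```python
import xml.etree.ElementTree as ET


def hadoop_property(name, value):
    """Render one Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# Assumption: CapacityScheduler; this entry would go into capacity-scheduler.xml.
RESOURCE_CALCULATOR = hadoop_property(
    "yarn.scheduler.capacity.resource-calculator",
    "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator",
)
```

With this calculator the scheduler takes both memory and vcores into account, so the cores requested by Spark should show up in the container allocation.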

[jira] [Commented] (SPARK-16864) Comprehensive version info

2016-08-03 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405487#comment-15405487 ] Saisai Shao commented on SPARK-16864: - A program way to get spark version is to call {{SparkContext

[jira] [Commented] (SPARK-14453) Remove SPARK_JAVA_OPTS environment variable

2016-08-02 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405112#comment-15405112 ] Saisai Shao commented on SPARK-14453: - If you want to fix this issue, it would be better target

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-08-02 Thread Saisai Shao
. Thanks, Saisai Shao

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-08-01 Thread Saisai Shao
with my hotfix the config > > will remain -Dhdp.version={{hdp_full_version}}. > > Saisai Shao wrote: > From my understanding, you mean that in the params.py we should also take > care of {{hdp_full_version}} if amabri is upgraded from lower version. Can > you please explain more

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-08-01 Thread Saisai Shao
get the specific version of Ambari and how to upgrade to the specific version? - Saisai --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50594/#review144425 --

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-08-01 Thread Saisai Shao
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50594/#review144418 ------- On Aug. 1, 2016, 1:22 a.m., Saisai Shao wrote: > > -

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-08-01 Thread Saisai Shao
e addition of -Dhdp.version should also be under condition > > check_stack_feature(StackFeature.SPARK_JAVA_OPTS_SUPPORT, > > effective_version). > > > > I assume -Dhdp.version is to be added only for HDP-2.3 and below. > > Saisai Shao wrote: >

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-08-01 Thread Saisai Shao
------- On Aug. 1, 2016, 1:22 a.m., Saisai Shao wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/50594/ > ---

Re: Getting error, when I do df.show()

2016-08-01 Thread Saisai Shao
> > java.lang.NoClassDefFoundError: spray/json/JsonReader > > at > com.memsql.spark.pushdown.MemSQLPhysicalRDD$.fromAbstractQueryTree(MemSQLPhysicalRDD.scala:95) > > at > com.memsql.spark.pushdown.MemSQLPushdownStrategy.apply(MemSQLPushdownStrategy.scala:49) >

[jira] [Comment Edited] (SPARK-16815) Dataset[List[T]] leads to ArrayStoreException

2016-08-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401566#comment-15401566 ] Saisai Shao edited comment on SPARK-16815 at 8/1/16 6:01 AM: - From

[jira] [Commented] (SPARK-16815) Dataset[List[T]] leads to ArrayStoreException

2016-08-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401566#comment-15401566 ] Saisai Shao commented on SPARK-16815: - From my understanding you can use {c

[jira] [Commented] (SPARK-16817) Enable storing of shuffle data in Alluxio

2016-07-31 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401432#comment-15401432 ] Saisai Shao commented on SPARK-16817: - What's difference compared to use ramdisk to store shuffle

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-07-31 Thread Saisai Shao
/test/python/stacks/2.3/SPARK/test_spark_thrift_server.py a1abdfa Diff: https://reviews.apache.org/r/50594/diff/ Testing --- Manual test with different scenarios: 1. Fresh install of HDP 2.3.6, 2.4.3, 2.5.0 2. Upgrade for 2.3.6 to 2.5.0. 3. Downgrade from 2.5.0 to 2.3.6. Thanks, Saisai

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-07-31 Thread Saisai Shao
>action="delete" > >) Done - Saisai ------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50594/#review144288

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-07-31 Thread Saisai Shao
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50594/#review144288 ------- On July 31, 2016, 3:44 a.m., Saisai Shao wrote: > > -

Re: Review Request 50594: Fix Spark hdp.version issues in upgrading and fresh install

2016-07-30 Thread Saisai Shao
/test/python/stacks/2.3/SPARK/test_spark_thrift_server.py a1abdfa Diff: https://reviews.apache.org/r/50594/diff/ Testing --- Manual test with different scenarios: 1. Fresh install of HDP 2.3.6, 2.4.3, 2.5.0 2. Upgrade for 2.3.6 to 2.5.0. 3. Downgrade from 2.5.0 to 2.3.6. Thanks, Saisai

[jira] [Commented] (AMBARI-17954) Fix Spark hdp.version issues in upgrading and fresh install

2016-07-28 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/AMBARI-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398740#comment-15398740 ] Saisai Shao commented on AMBARI-17954: -- CC [~sumitmohanty] [~jluniya], please help to review

[jira] [Created] (AMBARI-17954) Fix Spark hdp.version issues in upgrading and fresh install

2016-07-28 Thread Saisai Shao (JIRA)
Saisai Shao created AMBARI-17954: Summary: Fix Spark hdp.version issues in upgrading and fresh install Key: AMBARI-17954 URL: https://issues.apache.org/jira/browse/AMBARI-17954 Project: Ambari

[jira] [Commented] (SPARK-16085) Spark stand-alone ui redirects to RM application master UI for yarn-client mode

2016-07-27 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395705#comment-15395705 ] Saisai Shao commented on SPARK-16085: - Unfortunately, there's no such configuration for Spark

[jira] [Commented] (SPARK-16708) ExecutorAllocationManager.numRunningTasks can be negative when stage retry

2016-07-26 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393377#comment-15393377 ] Saisai Shao commented on SPARK-16708: - Looks similar to SPARK-11334, and I have a patch on it, though

Re: yarn.exceptions.ApplicationAttemptNotFoundException when trying to shut down spark applicaiton via yarn applicaiton --kill

2016-07-26 Thread Saisai Shao
Some useful information can be found here (https://issues.apache.org/jira/browse/YARN-1842), though personally I haven't met this problem before. Thanks Saisai On Tue, Jul 26, 2016 at 2:21 PM, Yu Wei wrote: > Hi guys, > > > When I tried to shut down spark application

[jira] [Commented] (SPARK-16723) exception in thread main org.apache.spark.sparkexception application finished with failed status

2016-07-25 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393033#comment-15393033 ] Saisai Shao commented on SPARK-16723: - So maybe this application is not yet started in the yarn side

[jira] [Commented] (SPARK-16723) exception in thread main org.apache.spark.sparkexception application finished with failed status

2016-07-25 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393003#comment-15393003 ] Saisai Shao commented on SPARK-16723: - Did you enable log aggregation in YARN, if not this command

[jira] [Comment Edited] (SPARK-16723) exception in thread main org.apache.spark.sparkexception application finished with failed status

2016-07-25 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393003#comment-15393003 ] Saisai Shao edited comment on SPARK-16723 at 7/26/16 1:36 AM: -- Did you

[jira] [Commented] (SPARK-16723) exception in thread main org.apache.spark.sparkexception application finished with failed status

2016-07-25 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392980#comment-15392980 ] Saisai Shao commented on SPARK-16723: - {{yarn logs -applicationId application_1467990031555_0089

[jira] [Commented] (SPARK-16723) exception in thread main org.apache.spark.sparkexception application finished with failed status

2016-07-25 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392965#comment-15392965 ] Saisai Shao commented on SPARK-16723: - I think you should check the AM and executor logs to see

Re: How to submit app in cluster mode? port 7077 or 6066

2016-07-21 Thread Saisai Shao
I think both 6066 and 7077 will work. 6066 uses the REST way to submit an application, while 7077 is the legacy way. From the user's perspective it should be transparent, with no need to worry about the difference. - *URL:* spark://hw12100.local:7077 - *REST URL:* spark://hw12100.local:6066
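
To make the two options concrete, here is a small sketch that assembles a `spark-submit` command line for standalone cluster mode against either port. The master URLs are the ones quoted in the message; the main class `com.example.MyApp` and jar path are placeholders invented for the example.

```python
import shlex


def submit_command(master_url, main_class, app_jar, app_args=()):
    """Assemble a spark-submit invocation for standalone cluster mode."""
    return [
        "spark-submit",
        "--master", master_url,
        "--deploy-mode", "cluster",
        "--class", main_class,
        app_jar,
        *app_args,
    ]


# Same application, submitted through the legacy port or the REST port.
legacy = submit_command("spark://hw12100.local:7077", "com.example.MyApp", "/path/to/app.jar")
rest = submit_command("spark://hw12100.local:6066", "com.example.MyApp", "/path/to/app.jar")
print(shlex.join(legacy))
print(shlex.join(rest))
```

Only the `--master` value differs between the two commands, which matches the point that the choice is transparent to the user.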

[jira] [Commented] (AMBARI-16864) Add unit tests for Spark2 service definition

2016-07-18 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/AMBARI-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383435#comment-15383435 ] Saisai Shao commented on AMBARI-16864: -- Done, patch updated. > Add unit tests for Spark2 serv

Re: scala.MatchError on stand-alone cluster mode

2016-07-15 Thread Saisai Shao
The error is thrown from your code: Caused by: scala.MatchError: [Ljava.lang.String;@68d279ec (of class [Ljava.lang.String;) at com.jd.deeplog.LogAggregator$.main(LogAggregator.scala:29) at com.jd.deeplog.LogAggregator.main(LogAggregator.scala) I think you should debug

[jira] [Updated] (SPARK-16540) Jars specified with --jars will added twice when running on YARN

2016-07-14 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-16540: Description: Currently when running spark on yarn, jars specified with \--jars, \--packages

[jira] [Created] (SPARK-16540) Jars specified with --jars will added twice when running on YARN

2016-07-14 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-16540: --- Summary: Jars specified with --jars will added twice when running on YARN Key: SPARK-16540 URL: https://issues.apache.org/jira/browse/SPARK-16540 Project: Spark

[jira] [Commented] (SPARK-16534) Kafka 0.10 Python support

2016-07-14 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376449#comment-15376449 ] Saisai Shao commented on SPARK-16534: - Maybe I can take a try if no one is working on this :). BTW do

[jira] [Commented] (SPARK-16534) Kafka 0.10 Python support

2016-07-14 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376432#comment-15376432 ] Saisai Shao commented on SPARK-16534: - Is there anyone working on this? > Kafka 0.10 Python supp

[jira] [Commented] (SPARK-16521) Add support of parameterized configuration for SparkConf

2016-07-13 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374612#comment-15374612 ] Saisai Shao commented on SPARK-16521: - I see, sorry about the duplication. > Add supp

[jira] [Closed] (SPARK-16521) Add support of parameterized configuration for SparkConf

2016-07-13 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao closed SPARK-16521. --- Resolution: Duplicate > Add support of parameterized configuration for SparkC

[jira] [Commented] (SPARK-16522) [MESOS] Spark application throws exception on exit

2016-07-13 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374583#comment-15374583 ] Saisai Shao commented on SPARK-16522: - Perhaps there's a race condition when exiting the Spark

[jira] [Updated] (SPARK-16521) Add support of parameterized configuration for SparkConf

2016-07-13 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-16521: Priority: Minor (was: Major) > Add support of parameterized configuration for SparkC

[jira] [Created] (SPARK-16521) Add support of parameterized configuration for SparkConf

2016-07-13 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-16521: --- Summary: Add support of parameterized configuration for SparkConf Key: SPARK-16521 URL: https://issues.apache.org/jira/browse/SPARK-16521 Project: Spark Issue

[jira] [Commented] (SPARK-16428) Spark file system watcher not working on Windows

2016-07-12 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372885#comment-15372885 ] Saisai Shao commented on SPARK-16428: - bq. Spark detected those files with the above terminal output

[jira] [Commented] (SPARK-16435) Behavior changes if initialExecutor is less than minExecutor for dynamic allocation

2016-07-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371975#comment-15371975 ] Saisai Shao commented on SPARK-16435: - OK, I will file a small patch to add the warning log about

[jira] [Created] (SPARK-16435) Behavior changes if initialExecutor is less than minExecutor for dynamic allocation

2016-07-07 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-16435: --- Summary: Behavior changes if initialExecutor is less than minExecutor for dynamic allocation Key: SPARK-16435 URL: https://issues.apache.org/jira/browse/SPARK-16435

[jira] [Updated] (SPARK-14743) Improve delegation token handling in secure clusters

2016-07-06 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-14743: Component/s: YARN > Improve delegation token handling in secure clust

[jira] [Closed] (SPARK-16342) Add a new Configurable Token Manager for Spark Running on YARN

2016-07-06 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao closed SPARK-16342. --- Resolution: Duplicate > Add a new Configurable Token Manager for Spark Running on Y

[jira] [Commented] (SPARK-16342) Add a new Configurable Token Manager for Spark Running on YARN

2016-07-06 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365535#comment-15365535 ] Saisai Shao commented on SPARK-16342: - Closing this JIRA as a duplicate and moving to SPARK-14743. >

[jira] [Commented] (SPARK-14743) Improve delegation token handling in secure clusters

2016-07-06 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365534#comment-15365534 ] Saisai Shao commented on SPARK-14743: - Post design doc here and move SPARK-16342 to here. > Impr

[jira] [Comment Edited] (SPARK-14743) Improve delegation token handling in secure clusters

2016-07-06 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365534#comment-15365534 ] Saisai Shao edited comment on SPARK-14743 at 7/7/16 3:18 AM: - Post design doc
