Spark - Kinesis integration needs improvements

2017-03-30 Thread Yash Sharma
Hello fellow Spark devs, hope you are doing fabulous. Dropping a brain dump here about the Spark Kinesis integration. I am able to get Spark Kinesis to work perfectly under ideal conditions, but see a lot of open ends when things are not so ideal. I feel there are a lot of open ends and are

Predicate not getting pushed down to PrunedFilteredScan

2017-03-30 Thread Hanumath Rao Maduri
Hello All, I am working on creating a new PrunedFilteredScan operator which has the ability to execute the predicates pushed to this operator. However, what I observed is that if a column deep in the hierarchy is used, it is not getting pushed down. SELECT tom._id, tom.address.city from

Re: [Spark on mesos] Spark framework not re-registered and lost after mesos master restarted

2017-03-30 Thread Timothy Chen
Hi Yu, As mentioned earlier, currently the Spark framework will not re-register as the failover_timeout is not set and there is no configuration available yet. It's only enabled in MesosClusterScheduler since it's meant to be an HA framework. We should add that configuration for users that want

Re: [Spark on mesos] Spark framework not re-registered and lost after mesos master restarted

2017-03-30 Thread Yu Wei
Hi Tim, I tested the scenario again with settings as below, [dcos@agent spark-2.0.2-bin-hadoop2.7]$ cat conf/spark-defaults.conf spark.deploy.recoveryMode ZOOKEEPER spark.deploy.zookeeper.url 192.168.111.53:2181 spark.deploy.zookeeper.dir /spark spark.executor.memory 512M spark.mesos.principal

[VOTE] Apache Spark 2.1.1 (RC2)

2017-03-30 Thread Michael Armbrust
Please vote on releasing the following candidate as Apache Spark version 2.1.1. The vote is open until Sun, April 2nd, 2017 at 16:30 PST and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 2.1.1 [ ] -1 Do not release this package because ...

Re: Outstanding Spark 2.1.1 issues

2017-03-30 Thread Holden Karau
Hi All, Just circling back to see if there is anything blocking the RC that isn't being tracked in JIRA? The current in progress list from ((affectedVersion = 2.1.1 AND cf[12310320] is Empty) OR fixVersion = 2.1.1 OR cf[12310320] = "2.1.1") AND project = spark AND resolution = Unresolved ORDER

Re: [Spark on mesos] Spark framework not re-registered and lost after mesos master restarted

2017-03-30 Thread Timothy Chen
I think failover isn't enabled on the regular Spark job framework, since we assume jobs are more ephemeral. It could be a good setting to add to the Spark framework to enable failover. Tim > On Mar 30, 2017, at 10:18 AM, Yu Wei wrote: > > Hi guys, > > I encountered a

Re: planning & discussion for larger scheduler changes

2017-03-30 Thread Tom Graves
If we are worried about major changes destabilizing current code (which I can understand), the only way around that is to make it pluggable or configurable. For major changes, making it pluggable seems cleaner in terms of avoiding code clutter. But it also means you may have to

[Spark on mesos] Spark framework not re-registered and lost after mesos master restarted

2017-03-30 Thread Yu Wei
Hi guys, I encountered a problem with Spark on Mesos. I set up a Mesos cluster and launched the Spark framework on Mesos successfully. Then the Mesos master was killed and started again. However, the Spark framework could not re-register as the Mesos agent does. I also couldn't find any error logs.