That fixed it!
Thank you!
--Ben
On Thu, Apr 14, 2016 at 5:53 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
> On Thu, Apr 14, 2016 at 2:14 PM, Benjamin Zaitlen <quasi...@gmail.com>
> wrote:
> >> spark-submit --master yarn-cluster /home/ubuntu/test_spark.py
Hi All,
I'm trying to use the --files option with yarn:
spark-submit --master yarn-cluster /home/ubuntu/test_spark.py --files
/home/ubuntu/localtest.txt#appSees.txt
I never see the file in HDFS or in the yarn containers. Am I doing
something incorrect?
I'm running Spark 1.6.0.
Thanks,
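The archive doesn't preserve what the fix was, but spark-submit treats everything after the primary resource (the .py file) as arguments to the application itself, so a --files flag placed after the script is silently ignored. A hedged sketch of the corrected invocation, using the paths from the question:

```shell
# Options must come BEFORE the application file; anything after the
# primary resource is passed through as an application argument.
spark-submit --master yarn-cluster \
  --files /home/ubuntu/localtest.txt#appSees.txt \
  /home/ubuntu/test_spark.py
# In yarn-cluster mode the file is then localized into the container's
# working directory under its alias, e.g. open("appSees.txt").
```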
Hi All,
Sean patiently worked with me in solving this issue. The problem was
entirely my fault: the MAVEN_OPTS env variable was set and was
overriding everything.
--Ben
On Tue, Sep 8, 2015 at 1:37 PM, Benjamin Zaitlen <quasi...@gmail.com> wrote:
> Yes, just reran with the
Hi All,
I'm trying to build a distribution off of the latest in master and I keep
getting errors on MQTT and the build fails. I'm running the build on an
m1.large which has 7.5 GB of RAM and no other major processes are running.
MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M"
On Tue, Sep 8, 2015 at 1:53 PM, Benjamin Zaitlen <quasi...@gmail.com>
> wrote:
> > Hi All,
> >
> > I'm trying to build a distribution off of the latest in master and I keep
> > getting errors on MQTT and the build fails. I'm running the build on an
> > m1.large which has
> a PR to change any occurrences of lower recommendations to 3gb.
>
> On Tue, Sep 8, 2015 at 3:02 PM, Benjamin Zaitlen <quasi...@gmail.com>
> wrote:
> > Ah, right. Should've caught that.
> >
> > The docs seem to recommend 2gb. Should that be increased as we
s -Pyarn -Phive -Phive-thriftserver -Phadoop-2.4
> >> -Dhadoop.version=2.4.0
> >
> >
> > So the heap size is still 2g even with MAVEN_OPTS set with 4g. I noticed
> > that within build/mvn _COMPILE_JVM_OPTS is set to 2g and this is what
> > ZINC_OPTS is set to.
> >
while compiling?
>
> Cheers
>
> On Tue, Sep 8, 2015 at 7:56 AM, Benjamin Zaitlen <quasi...@gmail.com>
> wrote:
>
>> I'm still getting errors with 3g. I've increased to 4g and I'll report
>> back
>>
>> To be clear:
>>
>> export MAVEN_OPTS="
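For reference, a sketch of the settings the Spark 1.x building guide recommended at the time (values per that guide; MaxPermSize applies to Java 7 and is ignored on Java 8). Note the wrinkle discussed above: build/mvn launches the zinc/scala compile JVMs with its own _COMPILE_JVM_OPTS default (-Xmx2g), so MAVEN_OPTS alone doesn't govern the compiler heap:

```shell
# Recommended by the Spark 1.x build docs; adjust heap upward if the
# build still OOMs. build/mvn's zinc JVM reads _COMPILE_JVM_OPTS, not
# MAVEN_OPTS, so the two are configured independently.
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
./build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
```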
Hi All,
I'm not quite clear on whether submitting a python application to spark
standalone on ec2 is possible.
Am I reading this correctly:
*A common deployment strategy is to submit your application from a gateway
machine that is physically co-located with your worker machines (e.g.
Master node in a standalone EC2 cluster).*
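Yes, that reading is right: standalone mode at the time did not support cluster deploy mode for Python applications, so the usual approach was to submit in the default client mode from a machine close to the cluster, typically the master node itself. A minimal sketch (hostname and script name are placeholders):

```shell
# Run from the EC2 master (or another co-located gateway machine);
# client deploy mode is the default and the only option for Python
# apps on a standalone cluster at that time.
spark-submit --master spark://<master-hostname>:7077 my_app.py
```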
HI Andy,
I built an anaconda/spark AMI a few months ago. I'm still iterating on it,
so if things break please report them. If you want to give it a whirl:
./spark-ec2 -k my_key -i ~/.ssh/mykey.rsa -a ami-3ecd0c56
The nice thing about anaconda is that it comes pre-baked with
ipython-notebook,
I may have missed this but is it possible to select on datetime in a
SparkSQL query?
jan1 = sqlContext.sql("SELECT * FROM Stocks WHERE datetime = '2014-01-01'")
Additionally, is there a guide as to what SQL is valid? The guide says,
"Note that Spark SQL currently uses a very basic SQL parser." It
, 2014 at 11:54 AM, Benjamin Zaitlen quasi...@gmail.com
wrote:
Hi All,
I'm a dev at Continuum and we are developing a fair amount of tooling around
Spark. A few days ago someone expressed interest in numpy+pyspark and
Anaconda came up as a reasonable solution.
I spent a number of hours yesterday trying to rework the base Spark AMI on
EC2 but sadly was
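The usual way to marry Anaconda with PySpark, rather than rebuilding the AMI, is to point the workers at Anaconda's interpreter via PYSPARK_PYTHON. A minimal sketch — the install path is an assumption, adjust to wherever Anaconda actually lives on your nodes:

```shell
# Set on every node (e.g. in conf/spark-env.sh, or exported before
# launching) so executors use Anaconda's python, with numpy et al.
# available. The path below is an assumed install location.
export PYSPARK_PYTHON=/opt/anaconda/bin/python
```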