Have you seen SPARK-5836 ?
Note TD's comment at the end.
Cheers
On Wed, Nov 18, 2015 at 7:28 PM, swetha wrote:
> Hi,
>
> We have a lot of temp files that get created due to shuffles caused by
> group by. How to clear the files that get created due to intermediate
>
I am a bit curious:
HBase depends on HDFS.
Has HDFS support for Mesos been fully implemented ?
Last time I checked, there was still work to be done.
Thanks
> On Nov 17, 2015, at 1:06 AM, 임정택 wrote:
>
> Oh, one thing I missed is, I built Spark 1.4.1 Cluster with 6 nodes of
ISDATE() is currently not supported.
Since it is SQL Server specific, I guess it wouldn't be added to Spark.
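If you just need an is-date flag, one rough workaround is to cast the string
column to DateType and test for null. A minimal sketch, assuming spark-shell's
sqlContext and a made-up column name:

import org.apache.spark.sql.types.DateType
import sqlContext.implicits._

// Strings that cannot be parsed as dates become null after the cast,
// so isNotNull plays the role of ISDATE().
val df = Seq(Tuple1("2015-11-17"), Tuple1("not a date")).toDF("order_date")
val withFlag = df.withColumn("is_date", $"order_date".cast(DateType).isNotNull)
withFlag.show()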
On Mon, Nov 16, 2015 at 10:46 PM, Ravisankar Mani wrote:
> Hi Everyone,
>
>
> In MSSQL Server, the "ISDATE()" function is used to find whether current
> column values are dates
Please take a look at the following for example:
./core/src/main/scala/org/apache/spark/api/python/PythonPartitioner.scala
./core/src/main/scala/org/apache/spark/Partitioner.scala
Cheers
On Tue, Nov 17, 2015 at 9:24 AM, prateek arora
wrote:
> Hi
> Thanks
> I am new
Looking in the local maven repo, breeze_2.10-0.7.jar contains DefaultArrayValue:
jar tvf
/Users/tyu/.m2/repository//org/scalanlp/breeze_2.10/0.7/breeze_2.10-0.7.jar
| grep DefaultArrayValue
369 Wed Mar
ter and Mesos cluster for some reasons, and
> I just can make it work via spark-submit or spark-shell / zeppelin with
> newly initialized SparkContext.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 2015-11-17 22:17 GMT+09:00 Ted Yu <yuzhih...@gmail.com>:
>
>> I am a b
Is the Scala version in IntelliJ the same as the one used by sbt ?
Cheers
On Tue, Nov 17, 2015 at 6:45 PM, 金国栋 wrote:
> Hi!
>
> I tried to build spark source code from github, and I successfully built
> it from command line using `*sbt/sbt assembly*`. While I encountered an
>
Have you considered asking on the Cassandra mailing list ?
A brief search led to CASSANDRA-7894
FYI
On Tue, Nov 17, 2015 at 7:24 PM, satish chandra j
wrote:
> HI All,
> I am getting "*.UnauthorizedException: User has no SELECT
> permission on or any of its parents*"
I don't think you should call ssc.stop() in StreamingListenerBus thread.
Please stop the context asynchronously.
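A minimal sketch of what I mean, assuming ssc is the StreamingContext your
listener observes:

// Hand the stop off to a separate thread so the StreamingListenerBus thread
// is never blocked waiting on itself while ssc.stop() drains the bus.
new Thread("stop-streaming-context") {
  override def run(): Unit = {
    ssc.stop(stopSparkContext = true, stopGracefully = true)
  }
}.start()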
BTW I have a pending PR:
https://github.com/apache/spark/pull/9741
On Tue, Nov 17, 2015 at 1:50 PM, jiten wrote:
> Hi,
>
> We're using Spark 1.5 streaming.
Which release of Spark are you using ?
Can you take a stack trace and pastebin it ?
Thanks
On Mon, Nov 16, 2015 at 5:50 AM, Kayode Odeyemi wrote:
> ./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g
> ~/migration-profiles-0.1-SNAPSHOT.jar
>
> is stuck
There is no such configuration parameter for selecting which nodes the
application master is running on.
Cheers
On Mon, Nov 16, 2015 at 12:52 PM, Alex Rovner
wrote:
> I was wondering if there is an analogous configuration parameter to
>
Wangda, a YARN committer, told me that support for selecting which nodes the
application master is running on is integrated into the upcoming hadoop 2.8.0
release.
Stay tuned.
On Mon, Nov 16, 2015 at 1:36 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> There is no such configuration
Please take a look at test_column_operators in python/pyspark/sql/tests.py
FYI
On Sat, Nov 14, 2015 at 11:49 PM, YaoPau wrote:
> I'm using pyspark 1.3.0, and struggling with what should be simple.
> Basically, I'd like to run this:
>
> site_logs.filter(lambda r: 'page_row'
Please take a look at http://www.infoq.com/articles/tuning-tips-G1-GC
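As a starting point only (the numbers below are illustrative, not tuned for your
workload), the G1 options are normally passed through spark-submit, e.g.:

--conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=35"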
Cheers
On Sat, Nov 14, 2015 at 10:03 PM, Renu Yadav wrote:
> I have tried with G1 GC. Please, can anyone provide their settings for GC?
> At code level I am:
> 1. reading orc table using dataframe
> 2. map
>
> Sent from my iPhone
>
>> On 14 Nov, 2015, at 11:21 pm, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> Which release are you using ?
>> If older than 1.5.0, you are missing some fixes such as SPARK-9952
>>
>> Cheers
>>
>>> On S
Which release are you using ?
If older than 1.5.0, you are missing some fixes such as SPARK-9952
Cheers
On Sat, Nov 14, 2015 at 6:35 PM, Jerry Lam wrote:
> Hi spark users and developers,
>
> Has anyone experienced the slow startup of a job when it contains a stage
> with over 4
I searched the code base and looked at:
https://spark.apache.org/docs/latest/running-on-yarn.html
I didn't find mapred.max.map.failures.percent or its counterpart.
FYI
On Fri, Nov 13, 2015 at 9:05 AM, Nicolae Marasoiu <
nicolae.maras...@adswizz.com> wrote:
> Hi,
>
>
> I know a task can fail 2
I tried with master branch.
scala> sc.getConf.getAll.foreach(println)
(spark.executor.id,driver)
(spark.driver.memory,16g)
(spark.unsafe.offHeap,true)
(spark.driver.host,172.18.128.12)
(spark.repl.class.uri,http://172.18.128.12:59780)
(spark.sql.tungsten.enabled,true)
will stick with this solution for the moment even if I find java Date
> ugly.
>
> Thanks for your help.
>
> 2015-11-11 15:54 GMT+01:00 Ted Yu <yuzhih...@gmail.com>:
>
>> In case you need to adjust log4j properties, see the following thread:
>>
>>
>> http
In case you need to adjust log4j properties, see the following thread:
http://search-hadoop.com/m/q3RTtJHkzb1t0J66=Re+Spark+Streaming+Log4j+Inside+Eclipse
Cheers
On Tue, Nov 10, 2015 at 1:28 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> I took a look at
> https://github.com/JodaOrg/jo
Please take a look
at launcher/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java
to see how app.getInputStream() and app.getErrorStream() are handled.
In master branch, the Suite is located
at core/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java
FYI
On Wed, Nov 11,
Have you tried the following ?
build/sbt "sql/test-only *"
Cheers
On Wed, Nov 11, 2015 at 7:13 PM, weoccc wrote:
> Hi,
>
> I am wondering how to run unit test for specific spark component only.
>
> mvn test -DwildcardSuites="org.apache.spark.sql.*" -Dtest=none
>
> The above
n*.
>
> regards,
> --Jakob
>
> *I'm myself pretty new to the Spark community so please don't take my
> words on it as gospel
>
>
> On 11 November 2015 at 15:25, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> For #1, the published jars are usable.
>> Howeve
Looks like the delegation token should be renewed.
Mind trying the following ?
Thanks
diff --git
a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerB
index 20771f6..e3c4a5a 100644
expose this issue in the PR build. Because
> SBT build doesn't do the shading, now it's hard for us to find similar
> issues in the PR build.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-11-09 18:47 GMT-08:00 Ted Yu <yuzhih...@gmail.com>:
>
>> Created https://githu
but am having the same problem.
>
> I ran:
>
> ./bin/pyspark --master yarn-client
>
> >> sc.stop()
> >> sc = SparkContext()
>
> Same error dump as below.
>
> Do I need to pass something to the new sparkcontext ?
>
> Thanks,
> Mike
>
'true'), (u'spark.ssl.trustStore',
> u'xxx.truststore')]
>
> I am not really familiar with "spark.yarn.credentials.file" and had
> thought it was created automatically after communicating with YARN to get
> tokens.
>
> Thanks,
> Mike
>
>
For #1, the published jars are usable.
However, you should build from source for your specific combination of
profiles.
Cheers
On Wed, Nov 11, 2015 at 3:22 PM, shajra-cogscale
wrote:
> Hi,
>
> My company isn't using Spark in production yet, but we are using a bit of
Can you show the stack trace for the NPE ?
Which release of Spark are you using ?
Cheers
On Tue, Nov 10, 2015 at 8:20 AM, romain sagean
wrote:
> Hi community,
> I try to apply the function below during a flatMapValues or a map but I
> get a nullPointerException with the
at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1400)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1361)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 15/11/
n the PR build.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-11-09 18:47 GMT-08:00 Ted Yu <yuzhih...@gmail.com>:
>
>> Created https://github.com/apache/spark/pull/9585
>>
>> Cheers
>>
>> On Mon, Nov 9, 2015 at 6:39 PM, Josh Rosen <joshro...@databric
I would suggest asking this question on SPARK-2365 since IndexedRDD has not
been released (upstream)
Cheers
On Mon, Nov 9, 2015 at 1:34 PM, swetha wrote:
>
> Hi ,
>
> What is the appropriate dependency to include for Spark Indexed RDD? I get
> compilation error if I
9, 2015 at 6:13 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> Yeah, we should probably remove that.
>>
>> On Mon, Nov 9, 2015 at 5:54 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> If there is no option to let shell skip processing @Visi
If there is no option to let shell skip processing @VisibleForTesting ,
should the annotation be dropped ?
Cheers
On Mon, Nov 9, 2015 at 5:50 PM, Marcelo Vanzin wrote:
> We've had this in the past when using "@VisibleForTesting" in classes
> that for some reason the shell
Which branch did you perform the build with ?
I used the following command yesterday:
mvn -Phive -Phive-thriftserver -Pyarn -Phadoop-2.4 -Dhadoop.version=2.7.0
package -DskipTests
Spark shell was working.
Building with latest master branch.
On Mon, Nov 9, 2015 at 10:37 AM, Zhan Zhang
I backtracked to:
ef362846eb448769bcf774fc9090a5013d459464
The issue was still there.
FYI
On Mon, Nov 9, 2015 at 10:46 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> Which branch did you perform the build with ?
>
> I used the following command yesterday:
> mvn -Phive -Phive-t
Please see
https://issues.apache.org/jira/browse/PARQUET-124
> On Nov 8, 2015, at 11:43 PM, swetha wrote:
>
> Hi,
>
> I see unwanted Warning when I try to save a Parquet file in hdfs in Spark.
> Please find below the code and the Warning message. Any idea as to how
Please consider using a NoSQL engine such as HBase.
Cheers
> On Nov 9, 2015, at 3:03 PM, Andrés Ivaldi wrote:
>
> Hi,
> I'm also considering something similar, Spark plain is too slow for my case,
> a possible solution is use Spark as Multiple Source connector and basic
>
Which release of Spark were you using ?
Can you post the command you used to run WordCount ?
Cheers
On Sat, Nov 7, 2015 at 7:59 AM, Shashi Vishwakarma wrote:
> I am trying to run simple word count job in spark but I am getting
> exception while running job.
>
> For
ypically, HiveContext has more functionality than SQLContext. In what case
> you have to use SQLContext that cannot be done by HiveContext?
>
> Thanks.
>
> Zhan Zhang
>
> On Nov 6, 2015, at 10:43 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
> What is interesting
Can you tell us a bit more about your use case ?
Are the two RDDs expected to be of roughly equal size, or of vastly
different sizes ?
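In general (a rough sketch with made-up data, assuming spark-shell's sc): if the
two sides are comparable in size, a plain join is the usual route; if one side is
much smaller, broadcasting it avoids shuffling the big RDD.

val big   = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
val small = sc.parallelize(Seq(("a", "x"), ("b", "y")))

// Comparable sizes: regular shuffle join.
val joined = big.join(small)

// Vastly different sizes: collect and broadcast the small side, then map over the big one.
val smallMap = sc.broadcast(small.collectAsMap())
val mapJoined = big.flatMap { case (k, v) =>
  smallMap.value.get(k).map(w => (k, (v, w)))
}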
Thanks
On Fri, Nov 6, 2015 at 3:21 PM, swetha wrote:
> Hi,
>
> What is the efficient way to join two RDDs? Would converting
You mentioned resourcemanager but not nodemanagers.
I think you need to install Spark on nodes running nodemanagers.
Cheers
On Fri, Nov 6, 2015 at 1:32 PM, Kayode Odeyemi wrote:
> Hi,
>
> I have a YARN hadoop setup of 8 nodes (7 datanodes, 1 namenode and
> resourcemanager).
2015-11-04 10:03:31,905 ERROR [Delegation Token Refresh Thread-0]
hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) -
Could not find uri with key [dfs.encryption.key.provider.uri] to
create a keyProvider !!
Could it be related to HDFS-7931 ?
On Wed, Nov 4, 2015 at 12:30
Have you tried using -Dspark.master=local ?
Cheers
On Wed, Nov 4, 2015 at 10:47 AM, Kayode Odeyemi wrote:
> Hi,
>
> I can't seem to understand why all created executors always fail.
>
> I have a Spark standalone cluster setup make up of 2 workers and 1 master.
> My spark-env
s:
>
> conf.setMaster("spark://192.168.2.11:7077")
> conf.set("spark.logConf", "true")
> conf.set("spark.akka.logLifecycleEvents", "true")
> conf.set("spark.executor.memory", "5g")
>
> On Wed, Nov 4, 2015 at 9:04 PM
Are you trying to speed up tests where each test suite uses a single
SparkContext ?
You may want to read:
https://issues.apache.org/jira/browse/SPARK-2243
Cheers
On Wed, Nov 4, 2015 at 4:59 AM, Priya Ch
wrote:
> Hello All,
>
> How to use multiple Spark Context in
I am a bit curious: why is the synchronization on finalLock needed ?
Thanks
> On Oct 23, 2015, at 8:25 AM, Anubhav Agarwal wrote:
>
> I have a spark job that creates 6 million rows in RDDs. I convert the RDD
> into Data-frame and write it to HDFS. Currently it takes 3
Looks like you were running 1.4.x or an earlier release, because the allowLocal
flag is deprecated as of Spark 1.5.0+.
Cheers
On Tue, Nov 3, 2015 at 3:07 PM, Jack Yang wrote:
> Hi all,
>
>
>
> I am saving some hive- query results into the local directory:
>
>
>
> val hdfsFilePath
Take a look at:
http://search-hadoop.com/m/q3RTtxRM5d2SLnmQ1=Re+Override+Logging+with+spark+streaming
On Tue, Nov 3, 2015 at 5:29 AM, diplomatic Guru
wrote:
> I have an issue with a Spark Streaming job that appears to be running but
> not producing any results.
sbt-interface.jar is under build/zinc-0.3.5.3/lib/sbt-interface.jar
You can run build/mvn first to download it.
Cheers
On Mon, Nov 2, 2015 at 1:51 AM, Todd wrote:
> Hi,
> I am trying to build spark 1.5.1 in my environment, but encounter the
> following error complaining
Please take a look at SPARK-2365
On Mon, Nov 2, 2015 at 3:25 PM, swetha kasireddy
wrote:
> Hi,
>
> Is Indexed RDDs released yet?
>
> Thanks,
> Swetha
>
> On Sun, Nov 1, 2015 at 1:21 AM, Gylfi wrote:
>
>> Hi.
>>
>> You may want to look into Indexed
A brief search in the code base shows the following:
TODO: Add simplex constraints to allow alpha in (0,1).
./mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala
I guess the answer to your question is no.
FYI
On Sun, Nov 1, 2015 at 9:37 AM, Zhiliang Zhu
Which Spark release are you using ?
Which OS ?
Thanks
On Sat, Oct 31, 2015 at 5:18 AM, hotdog wrote:
> I meet a situation:
> When I use
> val a = rdd.pipe("./my_cpp_program").persist()
> a.count() // just use it to persist a
> val b = a.map(s => (s,
From the result of http://search-hadoop.com/?q=spark+Martin+Senne,
Martin's post on Tuesday didn't go through.
FYI
On Sat, Oct 31, 2015 at 9:34 AM, Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:
> Nabble is an unofficial archive of this mailing list. I don't know who
> runs it, but it's
Jone:
For #3, consider asking on the vendor's mailing list.
On Fri, Oct 30, 2015 at 7:11 AM, Akhil Das
wrote:
> You can set it to MEMORY_AND_DISK, in this case data will fall back to
> disk when it exceeds the memory.
>
> Thanks
> Best Regards
>
> On Fri, Oct 23, 2015 at
How about the following ?
scala> df.registerTempTable("df")
scala> df1.registerTempTable("df1")
scala> sql("select customer_id, uri, browser, epoch from df union select
customer_id, uri, browser, epoch from df1").show()
+---+-+---+-+
|customer_id|
I searched for sportingpulse in *.scala and *.java files under 1.5 branch.
There was no hit.
mvn dependency doesn't show sportingpulse either.
Is it possible this is specific to EMR ?
Cheers
On Fri, Oct 30, 2015 at 2:57 PM, Zhang, Jingyu
wrote:
> There is not a
a:51)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:221)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:242)
>
>
> On Fri, Oct 30, 2015 at 3:34 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>
Which Spark release are you using ?
Please note the typo in email subject (corrected as of this reply)
On Thu, Oct 29, 2015 at 7:00 PM, Jey Kottalam wrote:
> Could you please provide the jstack output? That would help the devs
> identify the blocking operation more
MQTTUtils.class is generated from
external/mqtt/src/main/scala/org/apache/spark/streaming/mqtt/MQTTUtils.scala
What command did you use to build ?
Which release / branch were you building ?
Thanks
On Wed, Oct 28, 2015 at 6:19 AM, Bob Corsaro wrote:
> Has anyone successful
buntu boxen and a gentoo box.
>
> On Wed, Oct 28, 2015 at 9:59 AM Ted Yu <yuzhih...@gmail.com> wrote:
>
>> MQTTUtils.class is generated from
>> external/mqtt/src/main/scala/org/apache/spark/streaming/mqtt/MQTTUtils.scala
>>
>> What command did you use to build
Jinghong:
Hadmin variable is not used. You can omit that line.
Which hbase release are you using ?
As Deng said, don't flush per row.
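For illustration only, a sketch against the HBase 1.x client API with made-up
table / column family names (adjust to the release you are actually on):

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

val rows = sc.parallelize(Seq(("row1", "v1"), ("row2", "v2")))
rows.foreachPartition { iter =>
  val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val mutator = conn.getBufferedMutator(TableName.valueOf("my_table"))
  iter.foreach { case (rowKey, value) =>
    val put = new Put(Bytes.toBytes(rowKey))
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
    mutator.mutate(put)   // buffered; no flush per row
  }
  mutator.flush()         // one flush per partition
  mutator.close()
  conn.close()
}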
Cheers
> On Oct 27, 2015, at 3:21 AM, Deng Ching-Mallete wrote:
>
> Hi,
>
> It would be more efficient if you configure the table and
I've run this on both OSX Lion and Ubuntu 12. Same error. No .gz file
>
>> On Mon, Oct 26, 2015 at 9:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> Looks like '-Pyarn' was missing in your command.
>>
>>> On Mon, Oct 26, 2015 at 12:06 PM, Kayode Odeyemi <drey...@gmai
/docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
> cp: cannot create directory
> `/home/emperor/javaprojects/spark/spark-[WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest':
> No such file or directory
>
>
> On Tue, Oct 27, 2015 at 2
Jinghong:
In one of the earlier threads on storing data to hbase, it was found that the
htrace jar was not on the classpath, leading to write failure.
Can you check whether you are facing the same problem ?
Cheers
On Tue, Oct 27, 2015 at 5:11 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> Jinghong:
ed-site.xml, yarn-default.xml,
> yarn-site.xml, hdfs-default.xml, hdfs-site.xml, hbase-default.xml,
> hbase-site.xml)
> - field (class: com.chencai.spark.ml.TrainModel3$$anonfun$train$5,
> name: configuration$1, type: class org.apache.hadoop.conf.Configuration)
> - object (class
> com.che
pport/sql/parquet_partitioned/year=2015/month=9/day=1:
> No such file or directory
> cp: /usr/local/spark-latest/spark-[WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/.part-r-7.
This is related:
SPARK-10955 Warn if dynamic allocation is enabled for Streaming jobs
which went into 1.6.0 as well.
FYI
On Mon, Oct 26, 2015 at 2:26 PM, Silvio Fiorito <
silvio.fior...@granturing.com> wrote:
> Hi Matthias,
>
> Unless there was a change in 1.5, I'm afraid dynamic resource
If you use the command shown in:
https://github.com/apache/spark/pull/9281
You should have got the following:
./dist/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/part-r-8.gz.parquet
Scala 2.11 is supported in the 1.5.1 release:
http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22spark-parent_2.11%22
Can you upgrade ?
Cheers
On Mon, Oct 26, 2015 at 6:01 AM, Bryan Jeffrey
wrote:
> All,
>
> I'm seeing the following error compiling Spark 1.4.1 w/ Scala
I logged SPARK-11318 with a PR.
I verified that by adding -Phive the datanucleus jars are included:
tar tzvf spark-1.6.0-SNAPSHOT-bin-custom-spark.tgz | grep datanucleus
-rw-r--r-- hbase/hadoop 1890075 2015-10-26 09:52
spark-1.6.0-SNAPSHOT-bin-custom-spark/lib/datanucleus-core-3.2.10.jar
bq. t = new Tuple2 (entry.getKey(),
entry.getValue());
The return statement is outside the loop.
That was why you got one RDD.
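In other words (a Scala sketch with illustrative data), emit one tuple per entry
via flatMap instead of returning a single Tuple2 after the loop:

val maps = sc.parallelize(Seq(Map("a" -> 1, "b" -> 2), Map("c" -> 3)))

// One output element per (key, value) entry.
val perEntry = maps.flatMap(m => m.map { case (k, v) => (k, v) })
// perEntry.collect() => Array((a,1), (b,2), (c,3))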
On Mon, Oct 26, 2015 at 9:40 AM, Yasemin Kaya wrote:
> Hi,
>
> I have *JavaRDD>>* and I want to
>
In zipRLibraries():
// create a zip file from scratch, do not append to existing file.
val zipFile = new File(dir, name)
I guess instead of creating sparkr.zip in the same directory as R lib, the
zip file can be created under some directory writable by the user launching
the app and
A dependency couldn't be downloaded:
[INFO] +- com.h2database:h2:jar:1.4.183:test
Have you checked your network settings ?
Cheers
On Sun, Oct 25, 2015 at 10:22 AM, Bilinmek Istemiyor
wrote:
> Thank you for the quick reply. You are God Send. I have long not been
>
Have you taken a look at the fix for SPARK-11000 which is in the upcoming
1.6.0 release ?
Cheers
On Sun, Oct 25, 2015 at 8:42 AM, Yao wrote:
> I have not been able to start Spark scala shell since 1.5 as it was not
> able
> to create the sqlContext during the startup. It
If you have a pull request, Jenkins can test your change for you.
FYI
> On Oct 25, 2015, at 12:43 PM, Richard Eggert wrote:
>
> Also, if I run the Maven build on Windows or Linux without setting
> -DskipTests=true, it hangs indefinitely when it gets to
>
The code below was introduced by SPARK-7673 / PR #6225
See item #1 in the description of the PR.
Cheers
On Sat, Oct 24, 2015 at 12:59 AM, Koert Kuipers wrote:
> the code that seems to flatMap directories to all the files inside is in
> the private
Mind sharing your code, if possible ?
Thanks
On Fri, Oct 23, 2015 at 9:49 AM, crakjie wrote:
> Hello.
>
> I have activated the file checkpointing for a DStream to unleash the
> updateStateByKey.
> My unit test worked with no problem but when I have integrated this in my
> full
Take a look at first section of https://spark.apache.org/community
On Fri, Oct 23, 2015 at 1:46 PM, wrote:
> This e-mail and any files transmitted with it are for the sole use of the
> intended recipient(s) and may contain confidential and privileged
>
Can you outline your use case a bit more ?
Do you want to know all the hosts which would run the map ?
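If the intent is only to observe, after the fact, which executor host each
partition ran on, a quick sketch (assuming spark-shell's sc) would be:

import java.net.InetAddress

val hosts = sc.parallelize(1 to 100, 8)
  .mapPartitions { iter =>
    // Tag every record with the hostname of the executor running this partition.
    val host = InetAddress.getLocalHost.getHostName
    iter.map(x => (host, x))
  }
  .keys
  .distinct()
  .collect()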
Cheers
On Fri, Oct 23, 2015 at 5:16 PM, weoccc wrote:
> in rdd map function, is there a way i can know the list of host names
> where the map runs ? any code sample would
RemoteActorRefProvider is in akka-remote_2.10-2.3.11.jar
jar tvf
~/.m2/repository/com/typesafe/akka/akka-remote_2.10/2.3.11/akka-remote_2.10-2.3.11.jar
| grep RemoteActorRefProvi
1761 Fri May 08 16:13:02 PDT 2015
akka/remote/RemoteActorRefProvider$$anonfun$5.class
1416 Fri May 08 16:13:02 PDT
The number of occurrences of such an incident is low.
I think currently we don't need to add the footer. I checked several other
Apache projects whose user@ I subscribe to - there is no such footer.
Cheers
On Wed, Oct 21, 2015 at 7:38 AM, Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:
>
I don't think passing sqlContext to map() is supported.
Can you describe your use case in more detail ? Why do you need to create a
DataFrame inside the map() function ?
Cheers
On Wed, Oct 21, 2015 at 6:32 PM, Ajay Chander wrote:
> Hi Everyone,
>
> I have a use case where
How many regions does your table have ?
Which hbase release do you use ?
Cheers
On Tue, Oct 20, 2015 at 12:32 AM, Amit Singh Hora
wrote:
> Hi All ,
>
> My spark job started reporting zookeeper errors. After seeing the zkdumps
> from the HBase master, I realized that there are N
Hi,
I couldn't access the following URL (404):
http://hbase.apache.org/book.html
The above is linked from http://hbase.apache.org
Where can I find the refguide ?
Thanks
<hora.a...@gmail.com> wrote:
> One region
> ------
> From: Ted Yu <yuzhih...@gmail.com>
> Sent: 20-10-2015 15:01
> To: Amit Singh Hora <hora.a...@gmail.com>
> Cc: user <user@spark.apache.org>
> Subject: Re: Spark opening t
<hora.a...@gmail.com>
> Sent: 20-10-2015 20:38
> To: Ted Yu <yuzhih...@gmail.com>
> Cc: user <user@spark.apache.org>
> Subject: RE: Spark opening to many connection with zookeeper
>
> I used that also but the number of connection goes on increasing started
On my Mac:
$ ls -l ~/.m2/repository/org/antlr/antlr/3.2/antlr-3.2.jar
-rw-r--r-- 1 tyu staff 895124 Dec 17 2013
/Users/tyu/.m2/repository/org/antlr/antlr/3.2/antlr-3.2.jar
Looks like there might be network issue on your computer.
Can you check ?
Thanks
On Tue, Oct 20, 2015 at 1:21 PM,
Pete:
Please don't mix unrelated email on the back of another thread.
To unsubscribe, see first section of https://spark.apache.org/community
On Tue, Oct 20, 2015 at 2:42 PM, Pete Zybrick wrote:
>
>
>
each ID, I need to
> get the statistic information.
>
>
>
> Best
>
> Frank
>
>
>
> *From:* Ted Yu [mailto:yuzhih...@gmail.com]
> *Sent:* Tuesday, October 20, 2015 3:12 PM
> *To:* ChengBo
> *Cc:* user
> *Subject:* Re: Get statistic result from RDD
>
>
>
Your mapValues can emit a tuple. If p(0) is between 0 and 5, first
component of tuple would be 1, second being 0.
If p(0) is 6 or 7, first component of tuple would be 0, second being 1.
You can use reduceByKey to sum up corresponding component.
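A rough sketch of the above, with made-up (id, p) data where p(0) is a numeric
string:

val data = sc.parallelize(Seq(
  ("id1", Array("3")), ("id1", Array("6")), ("id2", Array("7")), ("id2", Array("2"))))

val counts = data
  .mapValues { p =>
    val v = p(0).toInt
    if (v >= 0 && v <= 5) (1, 0)        // falls in [0, 5]
    else if (v == 6 || v == 7) (0, 1)   // is 6 or 7
    else (0, 0)
  }
  .reduceByKey { case ((a1, b1), (a2, b2)) => (a1 + a2, b1 + b2) }

counts.collect().foreach(println)   // (id1,(1,1)), (id2,(1,1))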
On Tue, Oct 20, 2015 at 1:33 PM, Shepherd
Have you tried the following options ?
--conf spark.driver.userClassPathFirst=true --conf spark.executor.
userClassPathFirst=true
Cheers
On Mon, Oct 19, 2015 at 5:07 AM, YiZhi Liu wrote:
> I'm trying to read a Thrift object from SequenceFile, using
> elephant-bird's
niBasedUnixGroupsMappingWithFallback not
> org.apache.hadoop.security.GroupMappingServiceProvider)
>
> 2015-10-19 22:23 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
> > Have you tried the following options ?
> >
> > --conf spark.driver.userClassPathFirst=true --conf
> > spark.execut
nnily enough, I have a repro that doesn't even use mysql so this seems
> to be purely a classloader issue:
>
> source: http://pastebin.com/WMCMwM6T
> 1.4.1: http://pastebin.com/x38DQY2p
> 1.5.1: http://pastebin.com/DQd6k818
>
>
>
> On Mon, Oct 19, 2015 at 11:51 AM, Te
The attachments didn't go through.
Consider pastebin'ning.
Thanks
On Mon, Oct 19, 2015 at 11:15 AM, gbop wrote:
> I've been struggling with a particularly puzzling issue after upgrading to
> Spark 1.5.1 from Spark 1.4.1.
>
> When I use the MySQL JDBC connector and an
tUvcBerd
>
> On Mon, Oct 19, 2015 at 11:18 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> The attachments didn't go through.
>>
>> Consider pastebin'ning.
>>
>> Thanks
>>
>> On Mon, Oct 19, 2015 at 11:15 AM, gbop <lij.ta...@gmail.com> wrote:
Under core/src/test/scala/org/apache/spark, you will find a lot of
examples of the map function.
FYI
On Mon, Oct 19, 2015 at 10:35 AM, Shepherd wrote:
> Hi all, I am new in Spark and Scala. I have a question in doing
> calculation. I am using "groupBy" to generate key value
Attachments didn't go through.
Mind using pastebin to show the code / error ?
Thanks
On Mon, Oct 19, 2015 at 3:01 PM, daze5112 wrote:
> Hi having some problems with the piece of code I inherited:
>
>
>
>
> the error messages i get are:
>
>
> the code runs if i
A brief search led me
to ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java :
private static final String HDFS_SESSION_PATH_KEY =
"_hive.hdfs.session.path";
...
public static Path getHDFSSessionPath(Configuration conf) {
SessionState ss = SessionState.get();
if (ss ==