Re: share/reuse off-heap persisted (tachyon) RDD in SparkContext or saveAsParquetFile on tachyon in SQLContext

2014-08-11 Thread Haoyuan Li
the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Haoyuan Li AMPLab, EECS, UC Berkeley http

First Bay Area Tachyon meetup: August 25th, hosted by Yahoo! (Limited Space)

2014-08-19 Thread Haoyuan Li
Hi folks, We've posted the first Tachyon meetup, which will be on August 25th and is hosted by Yahoo! (Limited Space): http://www.meetup.com/Tachyon/events/200387252/ . Hope to see you there! Best, Haoyuan -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu/~haoyuan/

Re: Saving very large data sets as Parquet on S3

2014-10-24 Thread Haoyuan Li
-in-parquet-format-on-s3 + http://stackoverflow.com/questions/26321947/multipart-uploads-to-amazon-s3-from-apache-spark + http://stackoverflow.com/questions/26291165/spark-sql-unable-to-complete-writing-parquet-data-with-a-large-number-of-shards thanks Daniel -- Haoyuan Li AMPLab, EECS, UC

Re: Persist kafka streams to text file, tachyon error?

2014-11-22 Thread Haoyuan Li
at the code. It has the following part. Is that a problem? .persist(StorageLevel.OFF_HEAP) Any advice? Thank you! J -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu/~haoyuan/

Re: Spark or Tachyon: capture data lineage

2015-01-02 Thread Haoyuan Li
)-C-E. Is this something already possible with spark/tachyon? If not, do you think it is possible? Would anyone mind sharing their experience in capturing the data lineage in a data processing pipeline? Best Regards, Jerry -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu

Re: deployment of spark on mesos and data locality in tachyon/hdfs

2015-03-31 Thread Haoyuan Li
-- Haoyuan Li AMPLab, EECS, UC

Re: deployment of spark on mesos and data locality in tachyon/hdfs

2015-04-01 Thread Haoyuan Li
configuration to talk to s3 or the hdfs datanode) and the mesos slave process. Is this correct? -- --Sean -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu/~haoyuan/

Re: StorageLevel: OFF_HEAP

2015-03-18 Thread Haoyuan Li
? On a different note, I was wondering if Tachyon has been used in a production environment by anybody in this group? Appreciate your help with this. - Ranga -- Haoyuan Li AMPLab, EECS, UC Berkeley http://www.cs.berkeley.edu/~haoyuan/

Fast big data analytics with Spark on Tachyon in Baidu

2015-05-12 Thread Haoyuan Li
Dear all, We’re organizing a meetup http://www.meetup.com/Tachyon/events/222485713/ on May 28th at IBM in Foster City that might be of interest to the Spark community. The focus is a production use case of Spark and Tachyon at Baidu. You can sign up here:

Re: How to stop making Multiple copies in memory when running multiple Spark jobs?

2015-07-05 Thread Haoyuan Li
You can also find more info here: http://tachyon-project.org/master/Running-Spark-on-Tachyon.html Hope this helps. Haoyuan On Tue, Jun 30, 2015 at 11:28 PM, Himanshu Mehra himanshumehra@gmail.com wrote: Hi neprasad, You should give the Tachyon system a try, or any other in-memory DB.

Re: How to keep RDDs in memory between two different batch jobs?

2015-07-22 Thread Haoyuan Li
-- Haoyuan Li CEO, Tachyon Nexus http://www.tachyonnexus.com/

Re: Writing files to s3 with out temporary directory

2017-11-22 Thread Haoyuan Li
This blog / tutorial may be helpful for running Spark in the Cloud with Alluxio. Best regards, Haoyuan On Mon, Nov 20, 2017 at 2:12 PM, lucas.g...@gmail.com wrote: > That sounds like a lot of work and if I

Re: Spark LOCAL mode and external jar (extraClassPath)

2018-04-12 Thread Haoyuan Li
This link should be helpful: https://alluxio.org/docs/1.7/en/Running-Spark-on-Alluxio.html Best regards, Haoyuan (HY) alluxio.com | alluxio.org | powered by Alluxio On Thu, Apr 12, 2018 at 6:32 PM, jb44