Hi Julien,
My time zone is CST (China Standard Time), GMT+8. Both 7/12 and 7/14 work for 
me.
The following time window (San Francisco time) works for me:
       3:30 PM to 9:00 PM
I am currently working on the Hive part, tracked in 
https://issues.apache.org/jira/browse/HIVE-8128
Also, will you use Hangout or some other way to host this event? My Hangout 
account is [email protected], BTW.

Yours,
Ferd

-----Original Message-----
From: Julien Le Dem [mailto:[email protected]] 
Sent: Wednesday, July 06, 2016 12:49 AM
To: [email protected]
Subject: Re: Parquet Vectorized Read hackathon

This will be based on what participants want to work on :)
We’d want to share ideas and existing implementations.

Nezih’s patch is one of them done in the context of Presto.
There are some vectorized readers in SparkSQL as well, for example this one:
https://github.com/apache/spark/blob/d6dc12ef0146ae409834c78737c116050961f350/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java
And others in the same package:
https://github.com/apache/spark/tree/158af162eac7348464c6751c8acd48fc6c117688/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet
Hong started this effort.

What time zone are you in?


> On Jul 4, 2016, at 7:57 AM, Xu, Cheng A <[email protected]> wrote:
> 
> Hi Julien,
> I am working on the Hive Parquet vectorization implementation, based on 
> the existing PR from Nezih: https://github.com/apache/parquet-mr/pull/257. 
> Will this hackathon be based on that implementation as well, or is a new 
> shared one planned?
> 
> Thanks
> Ferd
> 
> -----Original Message-----
> From: Julien Le Dem [mailto:[email protected]] 
> Sent: Saturday, July 2, 2016 7:01 AM
> To: [email protected]; nezih yigitbasi <[email protected]>; 
> Daniel Weeks <[email protected]>; Ryan Blue <[email protected]>; Steven 
> Phillips <[email protected]>; Nong Li <[email protected]>; Alex Levenson 
> <[email protected]>; [email protected]; Wes McKinney 
> <[email protected]>; [email protected]; [email protected]; Jacques 
> Nadeau <[email protected]>
> Subject: Parquet Vectorized Read hackathon
> 
> Dear Parquet dev list,
> There have been efforts in several projects for vectorized reads of Parquet.
> We had discussed during the Parquet sync up to organize a hackathon to 
> brainstorm and look into a shared implementation.
> Some projects that would benefit:
> - Apache Drill
> - Apache Arrow
> - Apache Spark
> - Presto
> - Apache Hive
> 
> I'm planning to organize this at the Dremio office in Mountain View, with 
> an optional hangout for people who want to join remotely.
> I'm adding to the "to:" people that have expressed interest or could be 
> interested but that's not an exhaustive list. Please respond to this email if 
> you wish to be included.
> Who's interested and what dates would work between this Tuesday 7/5 and 
> Wednesday 7/20?
> 
> --
> Julien
