Notes:

Attendance:
- Ajay: (USA ET). Here to listen and learn. Has been using storage formats at work.
- Kirils: (Europe). Memory alignment in Arrow, with a corresponding PR for Netty.
- Uwe: (Europe). Ready to make a 0.2 release in the next 2 weeks.
- Wes: (USA ET). Two Sigma in NY. Working on the C++/Python components; ready for 0.2 as well. Worked with Nong on the streaming formats with integration tests, and with Uwe on the Arrow-Parquet integration (multi-threaded Parquet reads, thread-safety work). SPARK-13534: convert from Spark datasets to Arrow (file based) => Spark Summit Boston. Great speedups. Need to ship a release to get it merged.
- Julien: (USA PT). Dremio in CA. Discussed streaming with Nong; release 0.2.

- Memory alignment (ARROW-186, PR#98):
  - Sometimes allocates too much memory.
  - Netty PR: https://github.com/netty/netty/pull/6293
  - Need to find out when the next Netty release comes out.
  - Optional for the 0.2 Arrow release.
- 0.2 release (ARROW-353):
  - See blocker on that JIRA.
  - SPARK-13534 depends on an Arrow release.
  - Some code cleanup JIRAs.
  - Integration test for binary data.
  - Other units for timestamps in Java.
  - (Optionally) C++: API for slicing arrays with zero copy, by adding an offset member in the array.
  - jemalloc for memory.
  - Julien to create a JIRA for some Java API improvements.
  - Goal: close or move over JIRAs by the end of next week (Friday 2/10) and make the release.
  - Uwe: release manager for 0.2 (will be the first release in the pip Python package manager).
- 0.3:
  - Integration tests for timestamps.

On Thu, Feb 2, 2017 at 10:00 AM, Julien Le Dem <jul...@dremio.com> wrote:

> The arrow sync is starting now:
> https://plus.google.com/hangouts/_/dremio.com/arrow
>
> On Thu, Feb 2, 2017 at 8:38 AM, Julien Le Dem <jul...@dremio.com> wrote:
>
>> (I just sent this to the Parquet list but this applies to Arrow as well)
>> Everybody interested is welcome.
>> If there is more than one of you in the same location I'd recommend
>> sharing the connection.
>> The sync is every other week, lasts one hour and goes as follows:
>> - go around the "table" for everyone to quickly introduce themselves and
>> state the agenda items they'd want discussed (if any). It could be letting
>> others know of what they're planning to work on, helping reaching a
>> consensus on a JIRA, reminding people to review something that's important
>> to them...
>> - once the agenda is built from this first round we go over each item in
>> order.
>> - at the end notes are sent to the list. They usually have a list of
>> action items (follow up on jira, review PR #x, ...) and
>> resolved/unresolved discussion points.
>>
>> Generally, discussions happen on the mailing list, JIRA or github PRs and
>> the sync helps getting those to conclusion faster.
>>
>> On Thu, Feb 2, 2017 at 8:36 AM, Julien Le Dem <jul...@dremio.com> wrote:
>>
>>> Reminder that the next Arrow sync is today at 10am PT (in 1 hour 25 min)
>>> on google hangout:
>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>
>>> On Thu, Jan 26, 2017 at 4:00 PM, Julien Le Dem <jul...@dremio.com>
>>> wrote:
>>>
>>>> The next Arrow sync will be Thursday February 2nd 10am PT on google
>>>> hangout
>>>> https://plus.google.com/hangouts/_/dremio.com/arrow
>>>> notes will be posted to the list
>>>>
>>>> --
>>>> Julien
>>>>
>>>
>>> --
>>> Julien
>>>
>>
>> --
>> Julien
>>
>
> --
> Julien
>

--
Julien