Thanks Jason! Nice job putting in detailed notes!

Tim

On Tue, Oct 22, 2013 at 10:03 AM, Michael Hausenblas <[email protected]> wrote:

>
> > Here are the notes from today's hangout. Michael, can you copy them into
> > the google doc?
>
> Thanks & done.
>
> Cheers,
> Michael
>
> --
> Michael Hausenblas
> Ireland, Europe
> http://mhausenblas.info/
>
> On 22 Oct 2013, at 17:49, Jason Altekruse <[email protected]> wrote:
>
> > Hello All,
> >
> > Here are the notes from today's hangout. Michael, can you copy them into
> > the google doc?
> >
> > participants: Jacques, Michael Hausenblas, Lisen Mu, Yash Sharma, Jinfeng,
> > Jason Altekruse, Harri, Steven Phillips, Timothy Chen, Julian Hyde
> >
> > New employee at MapR: Jinfeng
> > - couple more in the next month
> >
> > Jacques:
> > - merged limit
> > - clarify VVs
> >   - never access internal state of VV when it is invalid
> > - release notes
> >
> > Steven:
> > - ordered partitioner
> > - abstract out distributed cache interface
> > - continue to work on spooling to disk
> >
> > Jason:
> > - semi-blocking
> > - look at sort and ordered hash partitioner
> >
> > Yash
> > - name of functions
> > - separate class for operators and functions for more clarity
> > - different operators have their own class files
> >
> > Lisen
> > - fork of Drill
> > - data pushed from leaves rather than pulled from root
> >   - we have been thinking about this same problem
> >   - don't want to wait for IO all the time
> >   - pre-fetch rather than push
> >     - in a join you might get pushed a huge amount of data when you
> >       aren't ready for it
> > - stream processing
> >   - alternative concept around foreman
> >   - not quite right for streams
> > - resource allocation
> >   - not as much for resource requirements
> > - HyperLogLog
> >   - space saving
> >   - acceptable - not precise
> > - data assembly - business logic
> > - approximations will be important to drill
> > - no serious thinking about sampling
> >   - certain types of scanners should support sampling
> >   - hard with some without reading all data anyway
> >   - HBase might be easier to do a scan
> >   - doing it with their own business logic and statistics
> >   - hard to generalize
> >
> > Hari
> > - not much for updates
> > - pick up with Amazon EC2 docs
> >   - had problem where we need 8 gigs
> >   - cannot get it running on free micro instance
> >   - got it working removing the direct memory flag in POM
> >   - tim - out of memory exception right away
> >     - was this with or without changing the option for direct memory?
> >
> > Tim
> > - wir patch in
> > - AMPLab big data benchmark
> >   - having numbers for performance evaluation
> >   - set up on their repo for drill datasets
> > - installing HDFS to all of the nodes
> >   - doesn't look too complicated
> > - cannot submit sql in distributed mode because of bad optimizer
> > - recent review board patches
> >   - describe code more completely
> >   - hard to review without docs
> >   - Julian - single powerpoint slide per operator
> >   - google doc? like the logical plan doc
> >
> > Ben
> > - code gen portion of merging receiver
> > - no blockers
> > - getting to code review soon
> >
> > Julian
> > - joined hortonworks
> > - working on optiq
> >   - helping hive, but also working on Drill
> >   - making optiq everything it can be
> > - splitting JDBC into thin client
> >   - thinking about it, no implementation yet
> > - right now pushing sorts down to Mongo
> > - jacques - session next week on JDBC?
> > - roadmap on optiq
> >   - commit logs tell some of the story
> >   - roadmap would be helpful
> >   - will put out call for optiq users like drill
> >   - put together feature list for next release(s)
> >   - next 6 months, want to be agile, but wants to be more predictable
> > - Jinfeng will be working with optimizer and optiq
> >
