Re: Parquet Writer Operator

2016-06-17 Thread Devendra Tagare
Hi, I will try this approach with a prototype and get back. Thanks, Dev On Thu, Jun 16, 2016 at 10:04 PM, Chandni Singh wrote: > Dev, > > The FileSystemWALWriter closes the temporary file as soon as it gets rotated. > It renames (finalizes) the temporary file to the

Re: Parquet Writer Operator

2016-06-16 Thread Chandni Singh
Dev, The FileSystemWALWriter closes the temporary file as soon as it gets rotated. It renames (finalizes) the temporary file to the actual file only once the window is committed. The mapping of temporary file to actual file is kept in the checkpointed state. The FileSystemWALReader reads from the
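The rotate-then-finalize behaviour described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Malhar FileSystemWAL code: the class name TempFileWalSketch, its methods, and the "part-N.tmp" naming are all hypothetical, and the in-memory map stands in for the temporary-to-actual mapping that the message says lives in the checkpointed state.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch; NOT the Apex Malhar FileSystemWAL API. */
class TempFileWalSketch {
  private static final String TMP_SUFFIX = ".tmp";

  private final Path dir;
  // Window id -> temporary files rotated in that window and not yet finalized.
  // Assumption: a real operator would keep this map in its checkpointed state.
  private final TreeMap<Long, List<Path>> pending = new TreeMap<>();
  private Path currentTemp;
  private int part = 0;

  TempFileWalSketch(Path dir) {
    this.dir = dir;
  }

  /** Appends one record to the current temporary file, creating it if needed. */
  void append(String record) throws IOException {
    if (currentTemp == null) {
      currentTemp = dir.resolve("part-" + part + TMP_SUFFIX);
    }
    Files.write(currentTemp, (record + "\n").getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
  }

  /** Rotation: stop writing to the current temporary file and remember it. */
  void rotate(long windowId) {
    if (currentTemp != null) {
      pending.computeIfAbsent(windowId, k -> new ArrayList<>()).add(currentTemp);
      currentTemp = null;
      part++;
    }
  }

  /** Commit: rename every temporary file rotated at or before windowId. */
  void committed(long windowId) throws IOException {
    NavigableMap<Long, List<Path>> ready = pending.headMap(windowId, true);
    for (List<Path> temps : ready.values()) {
      for (Path temp : temps) {
        String name = temp.getFileName().toString();
        String actualName = name.substring(0, name.length() - TMP_SUFFIX.length());
        Files.move(temp, temp.resolveSibling(actualName), StandardCopyOption.ATOMIC_MOVE);
      }
    }
    ready.clear(); // the view is backed by 'pending', so this drops finalized entries
  }
}
```

With this shape, a file rotated in window 1 becomes visible under its actual name only after committed(1) is called, while a file rotated in window 2 stays under its temporary name until that window is committed as well.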

Re: Parquet Writer Operator

2016-06-16 Thread Devendra Tagare
Hi, WAL-based approach: The FileSystemWAL.FileSystemWALWriter closes a temporary file only after the window is committed. We cannot read any such files until that point. Once the file is committed, in the same committed callback the ParquetOutputOperator will have to read the committed files,

Re: Parquet Writer Operator

2016-06-15 Thread Thomas Weise
Hi Dev, Can you not use the existing WAL implementation (via WindowDataManager or directly)? Thomas On Wed, Jun 15, 2016 at 3:47 PM, Devendra Tagare wrote: > Hi, > > Initial thoughts were to go for a WAL-based approach where the operator > would first write POJOs

Re: Parquet Writer Operator

2016-06-14 Thread Devendra Tagare
Hi All, We can focus on the two problems below: 1. Avoid the small-files problem, which could arise due to a flush at every endWindow, since there wouldn't be significant data in a window. 2. Fault tolerance. *Proposal*: Create a module in which there are two operators, *Operator 1 :