Patrick,

Can you please open an issue for this? I think we should fix this before
the 1.9.0 release. Thanks!

rb

On Tue, Oct 4, 2016 at 11:58 AM, Patrick Woody <[email protected]>
wrote:

> Looking a bit more - it looks like this is because decompression converts
> to a StreamBytesInput automatically. The current tests run with the
> uncompressed codec, so it doesn't hit this issue. I've put up a commit here
> that demonstrates the issue and my current workaround:
> https://github.com/palantir/parquet-mr/pull/10/commits/70cc00cba5c294d4c860bd4cd2c48c2d083a5809
>
> Thanks,
> Patrick
>
> On Tue, Oct 4, 2016 at 4:33 PM, Patrick Woody <[email protected]>
> wrote:
>
> > Hey all,
> >
> > Running a parquet-mr build off of master and I'm seeing some interesting
> > behavior when using a DictionaryFilter to prune row groups. Basically, if
> > I have an And or Or filter the DictionaryPage object gets re-used. This
> > seems to be a problem for StreamBytesInput because the stream gets
> > exhausted after the first toByteArray call. My current workaround is to
> > synchronize and just re-use the byte array after the first read, but I'd
> > be curious as to what people think the best approach to solving this is
> > and if we should be reusing the BytesInput at all.
> >
> > Best,
> > Patrick
> >
>
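
For readers following the thread, the workaround described in the quoted message (read the stream once under a lock, then re-use the resulting byte array) might look roughly like the sketch below. This is not the code from the linked commit and it does not use the parquet-mr BytesInput API. CachedStreamBytes is a hypothetical class name, and the sketch assumes only that the underlying source is a plain java.io.InputStream that can be consumed once.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/**
 * Hypothetical helper, not part of parquet-mr: wraps a one-shot stream
 * and caches its contents so that repeated reads return the same bytes.
 */
final class CachedStreamBytes {
    private final InputStream in; // one-shot source, empty after the first full read
    private byte[] cached;        // filled lazily on the first toByteArray() call

    CachedStreamBytes(InputStream in) {
        this.in = in;
    }

    /**
     * Synchronized so that two callers cannot both try to drain the stream.
     * The cached array is shared between callers, so they must not modify it.
     */
    synchronized byte[] toByteArray() {
        if (cached == null) {
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                cached = out.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return cached;
    }
}
```

With a wrapper like this, a second toByteArray() call returns the cached bytes instead of reading from an already exhausted stream, at the cost of keeping the whole page in memory.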



-- 
Ryan Blue
Software Engineer
Netflix
