Re: HDFS Compression

Nick Allen Tue, 11 Oct 2016 10:01:23 -0700

I don't think we have put much thought into exactly how the data should be
landed in HDFS, or for which use cases.  It just has not been a priority.


That being said, this might be a good time to gather everyone's thoughts on
how they would use that kind of data and for what purposes.
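
To make Owen's splittability point (quoted below) concrete, here is a minimal
Python sketch. It assumes gzip as the codec, which the thread does not specify,
and the records are hypothetical stand-ins, not real Metron output. It shows
that a reader can start at an arbitrary byte offset in uncompressed
newline-delimited JSON, but not in the middle of a gzip stream.

```python
import gzip
import json
import zlib

# Hypothetical records standing in for indexed messages (not real Metron output).
records = [{"id": i, "ip_src_addr": f"10.0.0.{i % 256}"} for i in range(1000)]
plain = "\n".join(json.dumps(r) for r in records).encode("utf-8") + b"\n"
packed = gzip.compress(plain)


def read_plain_split(data, start):
    """Read whole records from an arbitrary byte offset in uncompressed JSON lines."""
    # Skip the partial record at the split boundary, then parse everything after it.
    first_newline = data.index(b"\n", start)
    return [json.loads(line) for line in data[first_newline + 1:].splitlines()]


def read_gzip_split(data, start):
    """Attempt the same on a gzip stream; no gzip header exists at `start`."""
    return zlib.decompressobj(wbits=31).decompress(data[start:])


# A worker handed the second half of the plain file can still find record boundaries.
tail = read_plain_split(plain, len(plain) // 2)

# A worker handed the second half of the gzip file cannot decode it on its own.
try:
    read_gzip_split(packed, len(packed) // 2)
    gzip_split_ok = True
except zlib.error:
    gzip_split_ok = False

print(len(tail), "records read from the plain split")
print("gzip split readable:", gzip_split_ok)
```

The practical consequence is that a single large gzip file has to be read from
the start by one task. Block-oriented codecs such as bzip2, or container formats
that compress in independent blocks, avoid that limitation.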



On Tue, Oct 11, 2016 at 12:11 PM, Owen O'Malley <[email protected]> wrote:

> Be careful of using compressed JSON, since it isn't splittable. JSON is
> also very slow for reading.
>
> .. Owen
>
> On Tue, Oct 11, 2016 at 4:31 AM, Casey Stella <[email protected]> wrote:
>
> > I'd also tack on to this that the configuration for the HDFS writer
> > should be moved to ZooKeeper rather than done in Flux, IMO.
> > On Tue, Oct 11, 2016 at 07:20 Otto Fowler <[email protected]> wrote:
> >
> > > The storage format and retrieval from that format should be
> > > configurable; that is a ‘boundary’ for Metron, so to speak.
> > >
> > > On October 10, 2016 at 16:15:12, [email protected] ([email protected])
> > > wrote:
> > >
> > > > Is there a specific reason why the JSON files stored in HDFS are not
> > > > compressed? I looked for some related JIRAs and mail conversations but
> > > > couldn't find this already mentioned. I'm wondering if there was a good
> > > > enough argument to keep things uncompressed, or if the subject just
> > > > hadn't been broached yet.
> > >
> > > Jon
> > > --
> > >
> > > Jon
> > >
> >
>



-- 
Nick Allen <[email protected]>
