I can’t remember for sure, but I think that with CompressContent the compressed file has to fit in a single HDFS block for this to work. IIRC Hadoop-Snappy differs from regular Snappy in that it writes a compression header into each block, so the file can be split up, reassembled, and decompressed correctly.
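To illustrate the per-block header idea described above, here is a minimal sketch of block-framed compression. This is not Hadoop's exact on-disk format (the real layout lives in Hadoop's BlockCompressorStream, and the details here are an assumption); zlib stands in for Snappy since Snappy bindings aren't in the Python standard library. The point is only that each block carries its own length headers, so a reader can decompress block-by-block instead of needing the whole stream.

```python
import struct
import zlib

# Illustrative block size; Hadoop's actual default differs.
BLOCK_SIZE = 64 * 1024

def frame_compress(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Compress data block-by-block, prefixing each block with length headers."""
    out = bytearray()
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        compressed = zlib.compress(block)  # zlib standing in for Snappy
        # 4-byte big-endian uncompressed length, then compressed length, then payload.
        out += struct.pack(">I", len(block))
        out += struct.pack(">I", len(compressed))
        out += compressed
    return bytes(out)

def frame_decompress(data: bytes) -> bytes:
    """Walk the framed stream block-by-block; no global header is needed."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        raw_len, comp_len = struct.unpack_from(">II", data, pos)
        pos += 8
        block = zlib.decompress(data[pos:pos + comp_len])
        assert len(block) == raw_len
        out += block
        pos += comp_len
    return bytes(out)
```

Because every block is self-describing, a reader positioned at a block boundary can decompress independently of the rest of the file. A raw single-stream compression (no per-block headers) has no such boundaries, which is the reassembly problem mentioned above.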
> On Nov 11, 2019, at 10:30 AM, Shawn Weeks <[email protected]> wrote:
>
> I’m assuming you’re talking about the Snappy problem. If you use CompressContent
> prior to PutHDFS you can compress with Snappy, as it uses the Java native Snappy
> library. The HDFS processors are limited to the actual Hadoop libraries, so
> they’d have to change away from the native library to get around this. I’m
> pretty sure we need instance loading to handle the other issues mentioned.
>
> Thanks
> Shawn
>
> From: Joe Witt <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Monday, November 11, 2019 at 8:56 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Influence about removing RequiresInstanceClassLoading from AbstractHadoopProcessor processor
>
> Peter
>
> The most common challenge is if two isolated instances both want to use a
> native lib. No two native libs with the same name can be in the same JVM.
> We need to solve that for sure.
>
> Thanks
>
> On Mon, Nov 11, 2019 at 9:53 AM Peter Turcsanyi <[email protected]> wrote:
>
> Hi Hai Luo,
>
> @RequiresInstanceClassLoading makes it possible to configure separate/isolated
> "Additional Classpath Resources" settings on your HDFS processors (e.g. an S3
> storage driver on one PutHDFS and Azure Blob on another).
>
> Is there any specific reason / use case why you are considering removing it?
>
> Regards,
> Peter Turcsanyi
>
> On Mon, Nov 11, 2019 at 3:30 PM abellnotring <[email protected]> wrote:
>
> Hi, all
>
> I’m considering removing the RequiresInstanceClassLoading annotation
> from class AbstractHadoopProcessor.
> Does anybody know the potential influence?
>
> Thanks
> By Hai Luo
