Hello All
I have an additional question on this subject: is there a way to merge content
based on a temporal window? AttributeRollingWindow does not help here.
In my context this would make it possible to build an aggregation layer (it is
for telemetry data that arrives at different rates and that I need to
normalize/aggregate). The flow could look like this:
1. Receive telemetry data.
2. Merge content based on the type of data and a temporal window.
3. Aggregate the merged bulk of data using QueryRecord. This should normally be
   efficient, as it is done per bulk.
4. Stream the result out (backend / MOM …).
Of course, all the aggregation should be dynamic: the merge and the generated
query would be driven by attributes that qualify the type of the data and the
aggregation that needs to be done.
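
To make the intended flow concrete, here is a minimal Python sketch of the idea, not NiFi configuration. It groups records by data type and a fixed (tumbling) time window, then applies a per-type aggregation. The names WINDOW_SECONDS and AGGREGATIONS and the record fields "type", "ts" and "value" are assumptions made up for this illustration.

```python
from collections import defaultdict
from statistics import mean

WINDOW_SECONDS = 60  # assumed window length, purely illustrative

# Hypothetical mapping from data type to the aggregation to apply.
AGGREGATIONS = {"cpu": mean, "bytes": sum}


def window_start(ts, width=WINDOW_SECONDS):
    """Return the start of the tumbling window that contains ts (epoch seconds)."""
    return ts - (ts % width)


def aggregate(records, width=WINDOW_SECONDS):
    """Group values by (type, window start), then aggregate each group."""
    bins = defaultdict(list)
    for rec in records:
        key = (rec["type"], window_start(rec["ts"], width))
        bins[key].append(rec["value"])
    # The aggregation function is selected from the data type in the key.
    return {key: AGGREGATIONS[key[0]](values) for key, values in bins.items()}


records = [
    {"type": "cpu", "ts": 0, "value": 10.0},
    {"type": "cpu", "ts": 30, "value": 20.0},
    {"type": "cpu", "ts": 65, "value": 40.0},
    {"type": "bytes", "ts": 10, "value": 100},
    {"type": "bytes", "ts": 50, "value": 50},
]
result = aggregate(records)
```

Here the two "cpu" values in the first minute are averaged together, the third falls into the next window, and the "bytes" values are summed.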
Additional question: if I understand correctly, QueryRecord is based on Drill,
and Drill can automatically infer the schema from a JSON file. Is there a
way to use this feature without going through the SchemaRepository?
Thanks in advance.
Thierry Hanot
From: James McMahon [mailto:[email protected]]
Sent: 20 July 2017 14:04
To: [email protected]
Subject: Re: MergeContent Inquiry
Outstanding. Thank you very much Joe.
On Thu, Jul 20, 2017 at 8:00 AM, Joe Witt <[email protected]> wrote:
Yep. Very common. Set the desired size or number of object targets
and set the 'Max Bin Age' so that it will kick out whatever you've got
by that time.
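
As a rough illustration of that behaviour, the sketch below models a bin that is released either when it reaches a minimum size or when it gets too old. It is a simplified Python model under stated assumptions, not NiFi's actual implementation; the Bin class and its fields (min_bytes, max_bin_age_s) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Bin:
    """Simplified model of a merge bin (hypothetical, not NiFi code)."""
    created_at: float       # time the bin was opened, in seconds
    min_bytes: int          # stand-in for a minimum size threshold
    max_bin_age_s: float    # stand-in for the 'Max Bin Age' setting
    sizes: list = field(default_factory=list)

    def add(self, size_bytes):
        self.sizes.append(size_bytes)

    def should_flush(self, now):
        """Flush when the minimum size is reached or the bin is too old."""
        if not self.sizes:
            return False  # nothing to package yet
        reached_min = sum(self.sizes) >= self.min_bytes
        too_old = (now - self.created_at) >= self.max_bin_age_s
        return reached_min or too_old


bin_ = Bin(created_at=0.0, min_bytes=1000, max_bin_age_s=300.0)
bin_.add(200)
early = bin_.should_flush(now=10.0)   # below minimum and still young
aged = bin_.should_flush(now=300.0)   # below minimum but age limit reached
bin_.add(900)
full = bin_.should_flush(now=20.0)    # minimum size reached
```

In this model a bin that never reaches the minimum size is still kicked out once the age limit passes, which is the case asked about below.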
On Thu, Jul 20, 2017 at 7:38 AM, James McMahon <[email protected]> wrote:
> Good morning. I have a situation where I have a staging directory into which
> may be dropped a small number or a large multitude of files. My customer
> wants me to package these up - but in a size range. I see that MergeContent
> allows me to set a MinimumGroupSize and a MaximumGroupSize.
>
> If all the files total less than the MinimumGroupSize in MB, would
> MergeContent take no action until enough files arrived to cross the minimum
> threshold, i.e., would it just sit and wait? Is it possible to combine the
> size thresholds with a time parameter so that if X time passes and no new
> files appear, the package is created despite falling short of the minimum
> size threshold?
>
> Thanks in advance once again for any insights. -Jim