Hi Josh,
Thanks - that worked. I did not try Som's method, but it would probably
have worked as well.
Best.
On 10/07/2014 9:01 PM, Josh Wills wrote:
Hey Samik,
Glob syntax should work in Crunch as well:
Pipeline p = …;
PCollection<MyAvroRecords> records =
p.read(From.avroFile("/raw/idm/events/year=2014/month=04/day=07/*/*/*.avro",
Avros.specifics(MyAvroRecords.class)));
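To make the depth of that pattern concrete, here is a small standalone sketch of which paths a three-level glob like this one selects. It uses the JDK's java.nio glob matcher as a stand-in, not Hadoop's or Crunch's own glob code, and it assumes a Unix-like default filesystem; the class name GlobDepthDemo and the sub-directory and file names (a, b, part-00000) are made up for illustration. The point it shows is that each `*` covers exactly one path component, so only files exactly three levels below day=07 with an .avro suffix are picked up.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobDepthDemo {
    // Same pattern as in the thread. Matching here is done by java.nio,
    // which is only a stand-in for the glob handling Hadoop performs.
    static final String PATTERN =
        "/raw/idm/events/year=2014/month=04/day=07/*/*/*.avro";

    // Returns true if the candidate path string matches PATTERN.
    static boolean matches(String candidate) {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + PATTERN);
        return matcher.matches(Paths.get(candidate));
    }

    public static void main(String[] args) {
        String base = "/raw/idm/events/year=2014/month=04/day=07";
        // Hypothetical file layouts, chosen only to vary depth and suffix.
        String[] candidates = {
            base + "/a/b/part-00000.avro",   // two directories, then the file
            base + "/a/part-00000.avro",     // one directory too few
            base + "/a/b/c/part-00000.avro", // one directory too many
            base + "/a/b/part-00000.txt"     // wrong suffix
        };
        for (String candidate : candidates) {
            System.out.println(matches(candidate) + "  " + candidate);
        }
    }
}
```

If the files sit at varying depths, a single pattern of this shape will not cover them, and listing the paths explicitly may be the simpler route.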
J
On Thu, Jul 10, 2014 at 8:18 AM, Som Satpathy <[email protected]> wrote:
Hi Samik,
You can create an AvroFileSource using the
AvroFileSource(List<Path> paths, AvroType<T> ptype) constructor in
org.apache.crunch.io.avro, then read that source in the pipeline.
Hope this helps.
Thanks,
Som
On Thu, Jul 10, 2014 at 2:12 AM, Samik Raychaudhuri <[email protected]> wrote:
Hi,
I am a Crunch newbie trying out a few things. I have a quick
question inspired by Pig syntax. The following glob-like
syntax works in Pig for loading multiple Avro files:
A = LOAD
'/raw/idm/events/year=2014/month=04/day=07/*/*/*.avro' using
LOAD_IDM;
I am wondering if there is something similar in the Crunch API
that would do this.
Regards.