Hi Josh,
Thanks - that worked. Did not try Som's method, but that would probably have worked as well.
Best.

On 10/07/2014 9:01 PM, Josh Wills wrote:
Hey Samik,

Glob syntax should work in Crunch as well:

Pipeline p = …;
PCollection<MyAvroRecords> records = p.read(From.avroFile("/raw/idm/events/year=2014/month=04/day=07/*/*/*.avro", Avros.specifics(MyAvroRecords.class)));
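
To see which files a pattern like this would pick up, here is a minimal sketch that uses only the JDK's java.nio PathMatcher. It is an illustration, not Crunch or Hadoop code: Hadoop expands globs through its own FileSystem, and this sketch only assumes that, as in java.nio, each * stays within a single path component. The class name GlobSketch and the directory names hour=00 and part=0 are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

final class GlobSketch {
    // The glob pattern from the thread.
    static final String GLOB =
        "/raw/idm/events/year=2014/month=04/day=07/*/*/*.avro";

    // Returns true if the given path string matches GLOB under java.nio glob rules.
    static boolean matches(String path) {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + GLOB);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // Hypothetical file layouts; only the prefix comes from the thread.
        List<String> candidates = Arrays.asList(
            "/raw/idm/events/year=2014/month=04/day=07/hour=00/part=0/a.avro",
            "/raw/idm/events/year=2014/month=04/day=07/hour=00/a.avro",
            "/raw/idm/events/year=2014/month=04/day=08/hour=00/part=0/a.avro");
        for (String candidate : candidates) {
            System.out.println(matches(candidate) + "  " + candidate);
        }
    }
}
```

Only the first candidate matches: the pattern requires exactly two directory levels below day=07 and a file name ending in .avro.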

J


On Thu, Jul 10, 2014 at 8:18 AM, Som Satpathy <[email protected]> wrote:

    Hi Samik,

    You can create an AvroFileSource with the
    AvroFileSource(List<Path> paths, AvroType<T> ptype) constructor in
    org.apache.crunch.io.avro, then read that source in the pipeline.

    Hope this helps.

    Thanks,
    Som


    On Thu, Jul 10, 2014 at 2:12 AM, Samik Raychaudhuri
    <[email protected]> wrote:

        Hi,

        I am a Crunch newbie trying out a few things. I have a quick
        question inspired by Pig syntax. The following glob-like
        syntax works in Pig for loading multiple Avro files:

        A = LOAD
        '/raw/idm/events/year=2014/month=04/day=07/*/*/*.avro' using
        LOAD_IDM;

        I am wondering if there is something similar in the Crunch API
        that would do this.

        Regards.



