[ https://issues.apache.org/jira/browse/CRUNCH-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418653#comment-16418653 ]
Clément MATHIEU commented on CRUNCH-668:
----------------------------------------

Patch updated. It restores the ability to pass a file as the path. The new logic mimics what {{SourceTargetHelper#getPathSize}} does. My understanding is that this is what Crunch aims to support, but a careful review is welcome as it seems easy to get wrong.

I also spotted a few places where globs are not supported. For example, passing a glob to a Source and materializing the resulting PCollection fails, while adding an intermediate identity DoFn makes it work. Unfortunately, I don't have time to fix them as they are not on my critical path.

> From.avroFile do not support globbing patterns (GenericData based overloads)
> ----------------------------------------------------------------------------
>
>                 Key: CRUNCH-668
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-668
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.15.0
>            Reporter: Clément MATHIEU
>            Assignee: Josh Wills
>            Priority: Major
>         Attachments: 0001-CRUNCH-668-Support-globbing-patterns-in-From-avroFil-v2.patch, 0001-CRUNCH-668-Support-globbing-patterns-in-From-avroFil.patch
>
>
> GenericData based overloads of {{From.avroFile}} throw a RuntimeException
> when a globbing pattern is provided. I see no reason not to support globbing
> patterns here, as they work fine with {{textFile}} and SpecificData based
> overloads.
> The issue is that the code extracting the Avro schema from the first file uses
> {{listStatus}} rather than {{globStatus}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
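
To illustrate the distinction the issue draws between listing a path and expanding a glob, here is a minimal, self-contained sketch. It is not the patch and not Crunch or Hadoop code: it uses {{java.nio.file}} as a stand-in for Hadoop's {{FileSystem}} so that it runs without Hadoop on the classpath. The class name {{GlobAwareFirstFile}} and both method names are hypothetical. It assumes '/'-separated paths and only handles a glob in the last path component. The idea is to expand the pattern first (the {{globStatus}} role), then take the match itself if it is a file, or look inside it if it is a directory (the {{listStatus}} role).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; java.nio stands in for Hadoop's FileSystem.
class GlobAwareFirstFile {

    // Plays the role of globStatus: expands a plain path or a glob into matches.
    // Assumption: glob metacharacters appear only in the last path component.
    public static List<Path> expand(String pathOrGlob) throws IOException {
        List<Path> matches = new ArrayList<>();
        boolean isGlob = pathOrGlob.matches(".*[*?\\[{].*");
        if (!isGlob) {
            Path p = Paths.get(pathOrGlob);
            if (Files.exists(p)) {
                matches.add(p);
            }
            return matches;
        }
        int slash = pathOrGlob.lastIndexOf('/');
        String parent = slash < 0 ? "." : (slash == 0 ? "/" : pathOrGlob.substring(0, slash));
        String pattern = pathOrGlob.substring(slash + 1);
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(Paths.get(parent), pattern)) {
            for (Path p : stream) {
                matches.add(p);
            }
        }
        Collections.sort(matches); // deterministic order for the example
        return matches;
    }

    // Returns the first regular file reachable from a file, a directory, or a glob.
    public static Optional<Path> firstFile(String pathOrGlob) throws IOException {
        for (Path match : expand(pathOrGlob)) {
            if (Files.isRegularFile(match)) {
                return Optional.of(match); // the path (or a glob match) is a file
            }
            if (Files.isDirectory(match)) {
                // Plays the role of listStatus: look one level inside the directory.
                List<Path> children = new ArrayList<>();
                try (DirectoryStream<Path> stream = Files.newDirectoryStream(match)) {
                    for (Path child : stream) {
                        if (Files.isRegularFile(child)) {
                            children.add(child);
                        }
                    }
                }
                Collections.sort(children);
                if (!children.isEmpty()) {
                    return Optional.of(children.get(0));
                }
            }
        }
        return Optional.empty();
    }
}
```

Listing the raw path directly, without the expansion step, would treat a pattern such as {{part-*.avro}} as a literal name that does not exist, which is the kind of failure the issue describes. Expanding first lets the same code accept a file, a directory, or a glob.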