+1

Steven

On Tue, Aug 18, 2015 at 9:59 AM, Till Westmann <[email protected]> wrote:
> Hello Efi,
>
> I think that it would be good to
> a) keep simple strings as collection identifiers in the query and to
> b) support different resolution mechanisms to resolve those strings to
>    actual files on different file systems.
> To choose/parameterize the resolution mechanism we could then have a flag
> on the CLI.
>
> Cheers,
> Till
>
>
> On Aug 18, 2015, at 9:09 AM, Efi <[email protected]> wrote:
> >
> > Hello everyone,
> >
> > I would like your thoughts on something about the HDFS reads. So far,
> > when you submitted a query with the collection or document function, the
> > system would first check your local file system. If the path existed
> > there, the query would run as normal; if it did not exist locally, the
> > system would read from HDFS. That could cause an issue when we have the
> > same paths on both local and HDFS.
> >
> > We thought of two ways around this: either the user includes a prefix in
> > the path, 'file://' for local and 'hdfs://' for HDFS, or we add another
> > argument, something like --filesystem='hdfs' for HDFS.
> >
> > The first one is simpler, but you cannot use relative paths; the second
> > one just adds another argument to the CLI.
> >
> > Which do you think would be better for us?
> >
> > Thank you,
> > Efi
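
For illustration only, here is a minimal Python sketch of how the two options in the thread could coexist: a 'file://' or 'hdfs://' prefix on the identifier is honored when present, otherwise a CLI-selected file system decides, and only when neither is given does the old "local first, then HDFS" check apply. The function name resolve_collection, the 'local'/'hdfs' return values, and the precedence between prefix and flag are assumptions made for this sketch, not the project's actual implementation.

```python
import os
from urllib.parse import urlparse


def resolve_collection(identifier, filesystem=None):
    """Return a (filesystem, path) pair for a collection identifier.

    `filesystem` stands for the value of a hypothetical --filesystem CLI
    flag ('local' or 'hdfs'); None means the flag was not given.
    """
    parsed = urlparse(identifier)

    # Option 1 from the thread: an explicit prefix on the identifier.
    # Assumption: the prefix takes precedence over the CLI flag.
    if parsed.scheme == "file":
        return ("local", parsed.path)
    if parsed.scheme == "hdfs":
        return ("hdfs", parsed.path)

    # Option 2 from the thread: a plain string (relative paths allowed),
    # with the file system chosen once on the command line.
    if filesystem in ("local", "hdfs"):
        return (filesystem, identifier)
    if filesystem is not None:
        raise ValueError(f"unknown filesystem: {filesystem!r}")

    # Neither prefix nor flag: the behavior described in the thread,
    # which is ambiguous when the same path exists in both places.
    if os.path.exists(identifier):
        return ("local", identifier)
    return ("hdfs", identifier)
```

With this shape, queries keep plain strings as collection identifiers, and the ambiguous fallback is only reached when the user supplies no hint at all.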
