Hello Efi,

I think that it would be good to
a) keep simple strings as collection identifiers in the query and
b) support different resolution mechanisms that resolve those strings to actual
files on different file systems.
To choose/parameterize the resolution mechanism, we could then have a flag on
the CLI.
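
As a rough sketch of what I mean (Python, purely illustrative; the resolver
names, the "local" default and the URI shapes are assumptions on my part, not
existing code, and the flag name is borrowed from Efi's suggestion below):

```python
import argparse
import os


# Hypothetical resolvers: each maps a plain collection identifier from the
# query to a location on one kind of file system.
def resolve_local(identifier):
    # Relative identifiers stay usable: they resolve against the working dir.
    return "file://" + os.path.abspath(identifier)


def resolve_hdfs(identifier):
    # Assumes the HDFS client configuration supplies the namenode, so the
    # authority part of the URI is left empty.
    return "hdfs:///" + identifier.lstrip("/")


# Registry of resolution mechanisms; a new file system only adds an entry.
RESOLVERS = {"local": resolve_local, "hdfs": resolve_hdfs}


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical CLI flag that selects the resolution mechanism.
    parser.add_argument("--filesystem", choices=sorted(RESOLVERS),
                        default="local")
    return parser


def resolve(identifier, filesystem):
    # The query keeps its simple string; only the resolver differs.
    return RESOLVERS[filesystem](identifier)
```

With this shape the query text never changes; only the flag decides whether
"data/books" ends up on the local disk or on HDFS.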

Cheers,
Till

> On Aug 18, 2015, at 9:09 AM, Efi <[email protected]> wrote:
> 
> Hello everyone,
> 
>    I would like your thoughts on something about the HDFS reads. So far, when 
> you submit a query with the collection or document function, the system 
> first checks your local file system. If the path exists there, it runs 
> as normal; if it does not exist locally, it reads from HDFS. That 
> could cause an issue when we have the same paths on both local and HDFS.
> 
>    We thought of two ways around this: either the user includes a prefix in 
> the path, 'file://' for local and 'hdfs://' for HDFS, or we add another 
> argument, something like --filesystem='hdfs' for HDFS.
> 
> The first one is simpler, but you cannot use relative paths; the second one 
> just adds another argument to the CLI.
> 
> Which do you think would be better for us?
> 
> Thank you,
> Efi
