Directory-level separation is sufficient, because a Drill workspace is defined at the directory level. Volume-level separation is not needed for Drill to handle different file types (though there may be other reasons why you would need volume-level separation).
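To illustrate what "defined at directory level" means, here is a minimal sketch that builds a Drill file storage plugin configuration with one workspace per directory. The directories are taken from the example layout in the question; the workspace names (`tsv`, `csv`, `json`) and the `maprfs:///` connection string are assumptions for illustration, and the `formats` section that a real plugin definition also carries is omitted for brevity.

```python
import json

# Hypothetical mapping of workspace name -> (directory, default input format).
# The directories mirror the example layout in the question; the workspace
# names are made up for illustration.
WORKSPACES = {
    "tsv": ("/VolumeA/TSVdir", "tsv"),
    "csv": ("/VolumeA/CSVdir", "csv"),
    "json": ("/VolumeA/JSONdir", "json"),
}


def build_plugin_config(connection="maprfs:///"):
    """Return a dict shaped like a Drill file storage plugin definition.

    The connection string is an assumption (a MapR-FS cluster); on plain
    HDFS it would be an hdfs:// URI instead.
    """
    return {
        "type": "file",
        "enabled": True,
        "connection": connection,
        "workspaces": {
            name: {
                "location": path,           # base directory of the workspace
                "writable": False,          # read-only is enough for querying
                "defaultInputFormat": fmt,  # format assumed for files in this directory
            }
            for name, (path, fmt) in WORKSPACES.items()
        },
    }


if __name__ == "__main__":
    # Print JSON that could be adapted for a storage plugin definition.
    print(json.dumps(build_plugin_config(), indent=2))
```

Assuming the plugin is registered under the name `dfs`, a query would then address a file relative to its workspace, for example SELECT * FROM dfs.tsv.`somefile.tsv` (the file name is hypothetical). All three directories can live on the same volume.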
If possible, it is probably best to keep to a single file type in each Drill workspace (base directory), as that makes it easier to understand the data in each workspace. Once you mix file types in a workspace, things can get trickier. Technically, you can get Drill to work with different file types in the same directory if your queries point at individual files, but it can cause problems for users, and other issues, in the future.

--Andries

Others may have different views.

On Feb 23, 2015, at 12:53 PM, Chad Smykay <[email protected]> wrote:

> Is the safest configuration for multiple file types (i.e., JSON, CSV, TSV) to
> put them in separate HDFS folders or MapR volumes? By safest I mean the
> ability to query the data types without running into too many bugs/issues.
> For example
>
> /VolumeA/TSVdir
> /VolumeA/CSVdir
> /VolumeA/JSONdir
>
> And just specify in the query to look at all *.tsv in the "TSVdir"
>
> Or is
> /VolumeA_TSVdir
> /VolumeB_CSVdir
> /VolumeC_JSONdir
>
> The best way?
> --
> Kind Regards,
> Chad Smykay | Solutions Architect | M: 210.273.2344
> mapr.com
