Hello,

I have a relatively well-organised repository of files containing a mix of medical images and CSV files derived from those images in a neuroscience lab.

The CSV files contain some interesting data that I would like to aggregate with Drill, but the naming convention is unusual: each file name contains an ID plus a prefix or suffix that identifies the category of the file, and everything is nested in a folder structure organised by subject, for example ID1/processing1/ID1-mx.csv.

How can I use Drill to filter out the files that I do not need and keep only the files containing my data?

For example, I would like to write something like

SELECT * FROM dfs.data.`/` WHERE dir1 = 'processing1' AND file LIKE '%-mx.csv';


Thanks



