Hi Tore, 
As I suspected, there was a small bug in the config file.  Are you comfortable 
with Java?  I’m in the middle of some significant Drill rework, so submitting 
a fix right now would be a little difficult.  I can send you the fix, however.
Best,
— C

> On Mar 6, 2026, at 15:11, Tore Van Grembergen <[email protected]> wrote:
> 
> Thanks for following up on this.
> Much appreciated.
>  
> Kind regards
>  
> Tore
>  
> From: Charles Givre <[email protected]>
> Sent: Friday, 6 March 2026 21:10
> To: Tore Van Grembergen <[email protected]>
> Cc: [email protected]
> Subject: Re: quering H5 "flatten" data in apache drill in releases after 
> 1.19.0
>  
> Hi Tore, 
> I’ll look into this.  I looked at the logic and everything is still there.  
> The current unit tests cover all of this functionality and they are 
> passing.  My theory is that the parameter is getting dropped in the UI.
> Best,
> — C
> 
> 
> On Mar 6, 2026, at 15:01, Tore Van Grembergen <[email protected]> wrote:
>  
> Dear Charles,
>  
> The same thing happens as with "showPreview": true.
> Apache Drill allows saving this config and does not give an error.
> However, when you look at it again, the parameter has disappeared.
>  
> Kind regards
>  
> Tore
>  
>  
>  
>  
> From: Charles Givre <[email protected]>
> Sent: Friday, 6 March 2026 06:43
> To: Tore Van Grembergen <[email protected]>
> Cc: [email protected]
> Subject: Re: quering H5 "flatten" data in apache drill in releases after 
> 1.19.0
>  
> What happens when you set it to false?
> 
> 
> 
> On Mar 6, 2026, at 00:28, Tore Van Grembergen <[email protected]> wrote:
>  
> Dear Charles,
>  
> Thanks for coming back on this.
>  
> I tried "showPreview": true.
> Apache Drill allows saving this config and does not give an error.
> However, when you look at it again, the parameter has disappeared.
> It is as if it is filtered out when the config file is saved.
>  
> Kind regards
>  
> Tore
>  
> From: Charles Givre <[email protected]>
> Sent: Friday, 6 March 2026 03:47
> To: [email protected]
> Cc: Tore Van Grembergen <[email protected]>
> Subject: Re: quering H5 "flatten" data in apache drill in releases after 
> 1.19.0
>  
> Hi Tore, 
> Thanks for your interest in and use of Drill.  Could you try this:
>  
> 1.  In the configuration for your dfs plugin, make sure that the config for 
> the hdf5 format is as shown below:
>  
> "hdf5": {
>   "type": "hdf5",
>   "extensions": [
>     "h5"
>   ],
>   "showPreview": true
> }
> 2.   Run a SELECT *  query on your HDF5 file and report back what the results 
> look like. 
>  
> A word about the HDF5 plugin.  The preview you are looking for is really just 
> meant to give a sample of the data.  If your data set is really large, it 
> will get truncated in that view.   Also, if I remember correctly, the name 
> “int_data” is the actual name of that column from the dataset. 
>  
> Really the better way to query your data is to use the defaultPath option.  
> This allows you to query tables within HDF5 files.  
>  
> SELECT int_col_0, int_col_1 
> FROM table(dfs.`hdf5/scalar.h5` (type => 'hdf5', defaultPath => '/nd/3D'))
> Best,
> — C
>  
> 
> 
> 
> 
> On Mar 5, 2026, at 15:46, Tore Van Grembergen via user <[email protected]> wrote:
>  
> Hi Team,
> 
> I am looking into using Apache Drill's capabilities for querying H5 data.
> The documentation on this at 
> https://drill.apache.org/docs/hdf5-format-plugin/ works for version 1.19.0, 
> but no longer as of 1.20.0.
> The column into which the actual data is mapped no longer seems to be 
> available.
> 
> E.g. the column int_data in the example below is no longer there.
> 
> apache drill> select * from dfs.test.`dset.h5`;
> |-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
> | path  | data_type | file_name | data_size | element_count | is_timestamp | is_time_duration | dataset_data_type | dimensions | int_data                                                                 |
> |-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
> | /dset | DATASET   | dset.h5   | 96        | 24            | false        | false            | INTEGER           | [4, 6]     | [[1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15,16,17,18],[19,20,21,22,23,24]] |
> |-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
> 
> 
> I have read somewhere that setting the parameter "showPreview": true in the 
> workspace definition should restore the original behaviour. However, when I 
> try to save this parameter, it is automagically removed.
> (Remark: the environment runs the apache/drill image in a Docker 
> container, and the config is stored on a mounted drive.)
> 
> The reason for needing this int_data or double_data column is that the field 
> often holds a large number of values, and it is not known upfront how many 
> there will be.
> Hence the "column" approach in select * from table(xyz) is not workable.
> It is necessary to be able to do e.g. select flatten(int_data) as int_data 
> from dfs.test.`dset.h5`;
> 
> Is there a way to get this (re)activated in Apache Drill 1.22 and its successors?
> 
> All help is much appreciated.
> 
> Kind regards
> 
> Tore
> 
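
The symptom described in this thread (the "showPreview" key vanishing after the dfs plugin config is saved) could be checked outside the web UI with a small round-trip script. The following is a minimal Python sketch and is not part of the original messages. It assumes that Drill's REST endpoint /storage/{name}.json answers GET and POST on the default port 8047 without authentication, and that the response has the shape {"name": ..., "config": {...}}. The helper names (dropped_keys, fetch_plugin, save_plugin, check_round_trip) are hypothetical.

```python
import copy
import json
import urllib.request

# Assumption: default Drill web port, no authentication enabled.
DRILL_URL = "http://localhost:8047"


def dropped_keys(sent, stored, prefix=""):
    """Return dotted paths of keys that are in `sent` but missing from `stored`."""
    missing = []
    for key, value in sent.items():
        path = f"{prefix}.{key}" if prefix else key
        if key not in stored:
            missing.append(path)
        elif isinstance(value, dict) and isinstance(stored[key], dict):
            # Recurse into nested objects such as "formats" -> "hdf5".
            missing.extend(dropped_keys(value, stored[key], path))
    return missing


def fetch_plugin(name):
    """Read a storage plugin definition (assumed shape: {"name", "config"})."""
    with urllib.request.urlopen(f"{DRILL_URL}/storage/{name}.json") as resp:
        return json.load(resp)


def save_plugin(name, config):
    """Write a storage plugin definition back to Drill."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    req = urllib.request.Request(
        f"{DRILL_URL}/storage/{name}.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req).close()


def check_round_trip(name="dfs"):
    """Save the hdf5 format with showPreview, read it back, list lost keys.

    Note: this overwrites the hdf5 format entry of the live plugin config.
    """
    config = fetch_plugin(name)["config"]
    config.setdefault("formats", {})["hdf5"] = {
        "type": "hdf5",
        "extensions": ["h5"],
        "showPreview": True,
    }
    sent = copy.deepcopy(config)
    save_plugin(name, config)
    stored = fetch_plugin(name)["config"]
    return dropped_keys(sent, stored)
```

Calling check_round_trip() against a running Drill instance returns the list of keys that did not survive the save. If "formats.hdf5.showPreview" is in that list, the key is being lost on the server side as well, not only in the web UI form.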
