What happens when you set it to false?

> On Mar 6, 2026, at 00:28, Tore Van Grembergen <[email protected]> wrote:
>
> Dear Charles,
>
> Thanks for coming back on this.
>
> I tried "showPreview": true.
> Apache Drill allows this config to be saved and does not give an error.
> However, when you look at the config again, the parameter has disappeared.
> It is as if it is filtered out when the config file is saved.
>
> Kind regards
>
> Tore
>
> From: Charles Givre <[email protected]>
> Sent: Friday, 6 March 2026 03:47
> To: [email protected]
> Cc: Tore Van Grembergen <[email protected]>
> Subject: Re: quering H5 "flatten" data in apache drill in releases after 1.19.0
>
> Hi Tore,
> Thanks for your interest and use of Drill. Could you try this:
>
> 1. In the configuration for your dfs plugin, make sure that the config for
>    the hdf5 format is as shown below:
>
>    "hdf5": {
>      "type": "hdf5",
>      "extensions": [
>        "h5"
>      ],
>      "showPreview": true
>    }
>
> 2. Run a SELECT * query on your HDF5 file and report back what the results
>    look like.
>
> A word about the HDF5 plugin. The preview you are looking for is really just
> meant to give a sample of the data. If your data set is really large, it
> will get truncated in that view. Also, if I remember correctly, the name
> “int_data” is the actual name of that column from the dataset.
>
> Really the better way to query your data is to use the defaultPath option.
> This allows you to query tables within HDF5 files.
>
>    SELECT int_col_0, int_col_1
>    FROM table(dfs.`hdf5/scalar.h5` (type => 'hdf5', defaultPath => '/nd/3D'))
>
> Best,
> -- C
>
> On Mar 5, 2026, at 15:46, Tore Van Grembergen via user <[email protected]> wrote:
>
> Hi Team,
>
> I am looking into using Apache Drill's capabilities for querying H5 data.
> The documentation on this as provided on the site
> https://drill.apache.org/docs/hdf5-format-plugin/ works for version 1.19.0,
> but not as of 1.20.0.
> The column into which the actual data is mapped no longer seems to be
> available.
>
> E.g. the column int_data in the example below is no longer there.
>
> apache drill> select * from dfs.test.`dset.h5`;
> |-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
> | path  | data_type | file_name | data_size | element_count | is_timestamp | is_time_duration | dataset_data_type | dimensions | int_data                                                                 |
> |-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
> | /dset | DATASET   | dset.h5   | 96        | 24            | false        | false            | INTEGER           | [4, 6]     | [[1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15,16,17,18],[19,20,21,22,23,24]] |
> |-------|-----------|-----------|-----------|---------------|--------------|------------------|-------------------|------------|--------------------------------------------------------------------------|
>
> I have read somewhere that a parameter in the workspace definition,
> "showPreview": true, should restore the original way of working. However,
> when I try to save this parameter, it is automagically removed.
> (Remark: the environment is running the apache/drill image in a Docker
> container; the config is stored on a mounted drive.)
>
> The reason for needing this int_data or double_data column is that there are
> often too many values in it, and it is not known up front how many values
> will be in the field.
> Hence the "column" approach in select * from table(xyz) is not workable.
> It is necessary to be able to do e.g.
>
>    select flatten(int_data) as int_data from dfs.test.`dset.h5`;
>
> Is there a way to get this (re)activated in Apache Drill 1.22 and successors?
>
> All help is much appreciated.
>
> Kind regards
>
> Tore
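
One way to tell whether "showPreview" is really dropped on save, rather than just hidden by the web UI, is to read the stored plugin config back over Drill's REST API. Below is a minimal, untested sketch. It assumes a Drillbit at http://localhost:8047 without authentication, a storage plugin named dfs, and a GET /storage/<name>.json response that wraps the plugin in a "config" object with a "formats" map. The helper names (hdf5_format, preview_flag, fetch_plugin) are hypothetical and not part of Drill.

```python
import json
import urllib.request

# Assumption: default Drill web port on the local machine, no authentication.
DRILL_URL = "http://localhost:8047"


def hdf5_format(show_preview):
    """Build the hdf5 format block quoted in the thread, with showPreview set."""
    return {"type": "hdf5", "extensions": ["h5"], "showPreview": show_preview}


def preview_flag(plugin, format_name="hdf5"):
    """Return the stored showPreview value, or None if the key is absent."""
    formats = plugin.get("config", {}).get("formats") or {}
    return formats.get(format_name, {}).get("showPreview")


def fetch_plugin(name="dfs", base_url=DRILL_URL):
    """Read the plugin config back from Drill (GET /storage/<name>.json)."""
    with urllib.request.urlopen(f"{base_url}/storage/{name}.json") as resp:
        return json.load(resp)
```

After saving the config, preview_flag(fetch_plugin()) returns True or False if the key was persisted and None if it is missing. Repeating the check with the flag set to false, as asked at the top of the thread, would show whether only one of the two values is being dropped.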
