Hi Giorgio,

Thanks for your questions. I'd like to answer the ones about MetadataIndexNode.

> Is this the layout you are looking for?

Yes, each node is composed of a few entries, and each entry is a tuple.

> I mean, do you use MetadataIndexNode later somewhere in the server?

No, this structure is only used in MetadataIndexTree.

> Is there room here to innovate with hat tries instead of using a list?

Actually, I think this is worth trying. We are also thinking about other structures to construct the index tree.

Thanks for your questions again!

--
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

> -----Original Messages-----
> From: "Giorgio Zoppi" <[email protected]>
> Sent Time: 2021-10-02 04:33:51 (Saturday)
> To: dev <[email protected]>
> Cc:
> Subject: MetaData and other questions
>
> Hello IOTDBers,
> I am slowly making progress.
>
> // List of <name, offset, childMetadataIndexType>
> private MetadataIndexNode metadataIndex;
>
> I am learning again,
> You model the index with a MetadataIndexNode, which is essentially a list
> of entries.
> Each entry is a MetadataIndexEntry, that is, a pair/tuple (using guava here would help).
> Looks like this:
> <name, offset, indexType>, <name, offset, indexType>, <name, offset, indexType>.
> Questions:
>
> - Is this the layout you are looking for? Do I understand correctly?
> - Is there room here to innovate with hat tries (
> https://github.com/tpn/pdfs/blob/master/HAT-trie%20-%20A%20Cache-conscious%20Trie-based%20Data%20Structure%20for%20Strings%20-%202007%20(CRPITV62Askitis).pdf)
> instead of using a list?
> What do you think? I mean, do you use MetadataIndexNode later somewhere in
> the server?
>
> The next step might be having a sample like this:
>
> /*
>  * on windows you may want to use <format>
>  */
> #include <exception>
> #include <fmt/core.h>
>
> using iotdb::tsfile::FileWriter;
> using iotdb::tsfile::TSRecord;
> using iotdb::tsfile::DataPoint;
>
> int main(int argc, char** argv) {
>     try {
>         FileWriter writer("/tmp/newfile.tsfile");
>         // Only write if the file was opened successfully.
>         if (writer.Ok()) {
>             // One record: timestamp 10000 for device "d1", two measurements.
>             TSRecord record{10000, "d1"};
>             record.add(DataPoint<float>("s1", 5.0f));
>             record.add(DataPoint<int>("s2", 5));
>             auto [ret, written] = writer.write(record);
>             if (ret) {
>                 fmt::print("Written {} with success", written);
>             }
>         }
>     } catch (const std::exception& e) {
>         fmt::print("exception {}", e.what());
>     }
> }
>
> Working in a BDD test (I use catch2 with BDD -
> https://github.com/catchorg/Catch2/blob/devel/docs/tutorial.md#bdd-style).
> Yes looks like Go.
> Better to keep it simple until we have writing, reading and querying.
> Then we can think about async io/io_uring/coroutines and things like that.
>
> From your side, if you have time, I might benefit from a small Java program that
> generates different parts of the file on disk, in order to do e2e testing,
> something that works like this:
> $ java org.apache.iotdb.native.FileGen --record --name data.dat // generate a record on disk
> $ java org.apache.iotdb.native.FileGen --chunk --name chunk.dat // generate a chunk
> And so on.
> Having that will allow me to create a set of pytests that:
> - launch the file generation for an item
> - launch a native application that reads the file part
>
> Ok, another interesting point is the bloom filters. Why do you use them?
> In parquet they use them for this reason:
> *In their current format, column statistics and dictionaries can be used
> for predicate pushdown. Statistics include minimum and maximum value, which
> can be used to filter out values not in the range. Dictionaries are more
> specific, and readers can filter out values that are between min and max
> but not in the dictionary.
> However, when there are too many distinct
> values, writers sometimes choose not to add dictionaries because of the
> extra space they occupy. This leaves columns with large cardinalities and
> widely separated min and max without support for predicate pushdown.*
> Is it for the same reason?
> As an implementation can I use this
> http://algo2.iti.kit.edu/singler/publications/cacheefficientbloomfilters-wea2007.pdf
> ?
> *From Cache-, Hash- and Space-Efficient Bloom Filters Felix Putze, Peter
> Sanders, Johannes Singler.*
>
> Thanks for your patience,
> Best Regards,
> Giorgio.
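
To make the node layout discussed in this thread concrete, here is a minimal C++ sketch of a node as a sorted list of <name, offset, indexType> entries with a binary-search lookup. The names IndexEntry, IndexNode, IndexType and findChild are hypothetical and are not the IoTDB API. The sketch also assumes that entries are kept sorted by name and that each entry covers all keys from its own name up to the next entry's name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the index type mentioned in the thread.
enum class IndexType { Internal, Leaf };

// Hypothetical mirror of the <name, offset, indexType> tuple.
struct IndexEntry {
    std::string name;     // device or measurement name
    std::int64_t offset;  // file offset of the child this entry points to
    IndexType childType;  // kind of child found at that offset
};

class IndexNode {
public:
    // Assumption: the caller passes entries already sorted by name.
    explicit IndexNode(std::vector<IndexEntry> entries)
        : entries_(std::move(entries)) {
        const bool sorted = std::is_sorted(
            entries_.begin(), entries_.end(),
            [](const IndexEntry& a, const IndexEntry& b) { return a.name < b.name; });
        assert(sorted);
        (void)sorted;  // avoid an unused-variable warning when NDEBUG is set
    }

    // Returns the last entry whose name is <= key, or nothing if key sorts
    // before every entry in this node.
    std::optional<IndexEntry> findChild(const std::string& key) const {
        auto it = std::upper_bound(
            entries_.begin(), entries_.end(), key,
            [](const std::string& k, const IndexEntry& e) { return k < e.name; });
        if (it == entries_.begin()) {
            return std::nullopt;
        }
        return *std::prev(it);
    }

private:
    std::vector<IndexEntry> entries_;
};
```

A trie-based structure such as the HAT-trie mentioned above would replace the sorted vector and the binary search inside findChild, while the entry tuple itself could stay the same.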

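On the bloom filter question, a small sketch may help frame what such a filter does: it answers "definitely not present" or "possibly present" for a key, so a reader can skip work when the answer is negative. This is a plain textbook Bloom filter with double hashing. It is not the cache-efficient blocked variant from the Putze, Sanders and Singler paper, and it is not IoTDB's implementation. The class name BloomFilter, its methods, and the example key are hypothetical, and using it to test whether a series path is present is an assumption rather than something confirmed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Textbook Bloom filter sketch (hypothetical names, not the IoTDB API).
class BloomFilter {
public:
    BloomFilter(std::size_t bits, std::size_t hashes)
        : bits_(bits, false), hashes_(hashes) {
        assert(bits > 0 && hashes > 0);
    }

    // Sets one bit per hash function for the given key.
    void add(const std::string& key) {
        for (std::size_t i = 0; i < hashes_; ++i) {
            bits_[index(key, i)] = true;
        }
    }

    // false: the key was definitely never added.
    // true: the key was possibly added (false positives can occur).
    bool mightContain(const std::string& key) const {
        for (std::size_t i = 0; i < hashes_; ++i) {
            if (!bits_[index(key, i)]) {
                return false;
            }
        }
        return true;
    }

private:
    // Double hashing: the i-th position is derived from two base hashes.
    std::size_t index(const std::string& key, std::size_t i) const {
        const std::uint64_t h1 = std::hash<std::string>{}(key);
        const std::uint64_t h2 = fnv1a(key) | 1u;  // force an odd step
        return static_cast<std::size_t>((h1 + i * h2) % bits_.size());
    }

    // 64-bit FNV-1a, used here only as a second, independent-looking hash.
    static std::uint64_t fnv1a(const std::string& s) {
        std::uint64_t h = 14695981039346656037ULL;
        for (unsigned char c : s) {
            h ^= c;
            h *= 1099511628211ULL;
        }
        return h;
    }

    std::vector<bool> bits_;
    std::size_t hashes_;
};
```

A filter built this way never reports a key it has seen as absent, which is the property a reader would rely on before skipping a file or chunk.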