kangpinghuang opened a new issue #1305: Add new file format for storage segment
URL: https://github.com/apache/incubator-doris/issues/1305
 
 
   **Is your feature request related to a problem? Please describe.**
   The segment format in BE storage is currently an ORC-like format. It has 
several problems:
   1. The file header is modified after all data has been flushed. This does 
not work in cloud environments, because files in distributed file systems 
(e.g. HDFS), S3, and similar stores do not support random writes.
   2. Random seeks. To read a stream, you first read the StreamHead (8 bytes) 
and then read the stream data. I think this mechanism is not good.
   3. There is no block cache.
   4. Strings are stored in plain encoding.
   5. It is hard to add secondary indexes.
   6. Data is stored in blocks with a static number of rows.
   ....
   So I would like to add a new segment format for BE to solve the problems 
mentioned above.
   
   The goals to achieve include:
   1. Write the file meta to the footer of the segment file.
   2. Support a block cache.
   3. Support secondary indexes, e.g. a bitmap index.
   4. Support dictionary-encoded string storage.
   5. Construct blocks of a configured size.
   6. Make it easy to extend the encoding and compression schemes.
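
   As a rough illustration of goals 1 and 5, here is a minimal Python sketch 
of an append-only segment writer that cuts blocks by byte size and puts the 
meta in a footer. It is not the proposed Doris format: the `SEG1` magic, the 
JSON footer, the fixed-length trailer, and all function names are assumptions 
made up for this example.

```python
import io
import json
import struct

MAGIC = b"SEG1"                  # hypothetical magic number, not a Doris value
TRAILER = struct.Struct("<I4s")  # assumed trailer: footer length (uint32 LE) + magic


def write_segment(out, rows, block_size):
    """Write data blocks, then the footer, then the trailer. Never seeks back."""
    blocks = []        # (offset, length) of every flushed block
    buf = bytearray()  # block currently being built
    offset = 0         # number of bytes written so far

    def flush():
        nonlocal offset, buf
        if buf:
            out.write(buf)
            blocks.append((offset, len(buf)))
            offset += len(buf)
            buf = bytearray()

    for row in rows:
        encoded = row.encode("utf-8")
        buf += struct.pack("<I", len(encoded)) + encoded
        if len(buf) >= block_size:  # cut by configured size, not by row count
            flush()
    flush()

    # The meta goes at the end, so the file is written strictly sequentially.
    footer = json.dumps({"blocks": blocks}).encode("utf-8")
    out.write(footer)
    out.write(TRAILER.pack(len(footer), MAGIC))


def read_footer(f):
    """Locate the footer by reading the fixed-size trailer at the file tail."""
    f.seek(-TRAILER.size, io.SEEK_END)
    footer_len, magic = TRAILER.unpack(f.read(TRAILER.size))
    if magic != MAGIC:
        raise ValueError("not a segment file")
    f.seek(-(TRAILER.size + footer_len), io.SEEK_END)
    return json.loads(f.read(footer_len))


def read_block(f, offset, length):
    """Read one block with a single seek and decode its length-prefixed rows."""
    f.seek(offset)
    data = f.read(length)
    rows, pos = [], 0
    while pos < len(data):
        (n,) = struct.unpack_from("<I", data, pos)
        pos += 4
        rows.append(data[pos:pos + n].decode("utf-8"))
        pos += n
    return rows
```

   Because the writer only appends, the same layout can be produced on stores 
without random writes. A reader needs the trailer and footer once, after 
which each block is addressable by its `(offset, length)` pair, which is also 
a natural key for a block cache.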
   
   

