On Tue, Aug 04, 2009 at 05:32:02PM -0700, K. John Wu wrote:
> I would prefer to stay away from boost library because it introduces
> dependencies for FastBit only to make use of something relatively
> minor..

The reason I referenced this buffer model was only to give a feel for
how such a raw byte buffer could be implemented in FastBit (I did not
mean to propose the inclusion of Boost in FastBit). This abstraction has
several benefits, such as:

* performance. Data is not copied but stays in its original place:
  while a C string obtained via std::string::c_str is only usable up to
  the first NUL byte and may involve a copy, constructing a buffer
  merely records a pointer to the memory location and its length.

* decoupling from the storage interface. The user can decide how to
  store the raw data (e.g., in a std::string, std::vector or simple
  uchar[] array), and multiple overloaded buffer constructors perform
  the appropriate conversion with the raw data as argument.

A downside is that the buffer access presupposes that the pointee is
alive.
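
To make the idea concrete, here is a minimal sketch of such a non-owning
buffer. The name raw_buffer and its members are hypothetical; this is
neither Boost's class nor anything that exists in FastBit, only an
illustration of the overloaded constructors described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical non-owning view over raw bytes (illustration only).
class raw_buffer {
public:
    // View over an arbitrary memory region.
    raw_buffer(const void* data, std::size_t size)
        : data_(static_cast<const unsigned char*>(data)), size_(size) {}

    // View over the bytes of a std::string; embedded NUL bytes are kept.
    explicit raw_buffer(const std::string& s)
        : data_(reinterpret_cast<const unsigned char*>(s.data())),
          size_(s.size()) {}

    // View over a std::vector of bytes.
    explicit raw_buffer(const std::vector<unsigned char>& v)
        : data_(v.data()), size_(v.size()) {}

    // View over a plain array of bytes.
    template <std::size_t N>
    explicit raw_buffer(const unsigned char (&a)[N]) : data_(a), size_(N) {}

    const unsigned char* data() const { return data_; }
    std::size_t size() const { return size_; }

    unsigned char operator[](std::size_t i) const {
        assert(i < size_);  // bounds check in debug builds only
        return data_[i];
    }

private:
    const unsigned char* data_;  // not owned: the pointee must outlive the view
    std::size_t size_;
};
```

No constructor copies any bytes, which is also why the view dangles as
soon as the underlying string, vector or array goes away.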

> Another alternative might be to input values of the new data type
> with a pair of arrays (one for const char*, another for lengths of
> the buffers).  Since this is different from other data types, guess
> we will also need to make a new column type to avoid messing with
> existing functions.  

What is the benefit of introducing a separate array for the length?
I could imagine that it has a negative impact on performance, because
each pointer and its length end up in different memory locations, which
hurts locality.
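
For comparison, here is a sketch of the two input layouts. All names
(parallel_input, raw_value, sum_last_bytes) are made up for
illustration and are not FastBit API. Whether the difference is
measurable in practice is an assumption I have not benchmarked.

```cpp
#include <cstddef>
#include <vector>

// Layout A (hypothetical): two parallel arrays, one entry per value.
struct parallel_input {
    std::vector<const char*> pointers;  // pointers[i]: start of value i
    std::vector<std::size_t> lengths;   // lengths[i]: size of value i
};

// Layout B (hypothetical): pointer and length stored side by side.
struct raw_value {
    const char* data;
    std::size_t size;
};

// Both functions need the pointer AND the length of every value.
// Layout A reads from two separate arrays per value ...
unsigned long sum_last_bytes(const parallel_input& in) {
    unsigned long sum = 0;
    for (std::size_t i = 0; i < in.pointers.size(); ++i)
        if (in.lengths[i] > 0)
            sum += static_cast<unsigned char>(in.pointers[i][in.lengths[i] - 1]);
    return sum;
}

// ... while layout B finds both fields in one adjacent record.
unsigned long sum_last_bytes(const std::vector<raw_value>& in) {
    unsigned long sum = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
        if (in[i].size > 0)
            sum += static_cast<unsigned char>(in[i].data[in[i].size - 1]);
    return sum;
}
```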

> This will affect how the values are outputted too..

What about printing raw data in hex?
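
Something along these lines would do. The helper to_hex is a
hypothetical name, not an existing FastBit function, and lowercase
digits without separators are just one possible output format.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: renders raw bytes as lowercase hex, two digits each.
std::string to_hex(const unsigned char* data, std::size_t size) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i) {
        out.push_back(digits[data[i] >> 4]);    // high nibble
        out.push_back(digits[data[i] & 0x0f]);  // low nibble
    }
    return out;
}
```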

> Anyway, looks like we will need to introduce a new column type one
> way of another.  Using std::string might be a little cleaner as far
> as I can imagine right now.  What do you think?

I think std::string forms a possible interface. However, unlike for
data in a std::vector, the current ISO C++ standard does not guarantee
that a std::string is stored in a contiguous chunk of memory, which is
why c_str() is allowed to create a copy of the string. Yet in practice,
the majority of STL implementations store string data contiguously [1].
So using std::string should not yield any weird side effects, and
&str[0] or str.data() both point to a contiguous character array. For
raw data, the extent should be taken from str.size() rather than from a
terminating NUL, since the data may itself contain NUL bytes and data()
is not guaranteed to be NUL-terminated.
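
A small sketch of what using std::string for raw data could look like,
assuming values may contain NUL bytes. The helpers make_raw and
c_length are hypothetical names for illustration only.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds a std::string from raw bytes. Passing the length explicitly
// keeps embedded NUL bytes instead of stopping at the first one.
std::string make_raw(const char* bytes, std::size_t n) {
    return std::string(bytes, n);
}

// Length as seen by a consumer that treats the data as a C string:
// counting stops at the first NUL byte.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For a value such as the five bytes 'a' 'b' NUL 'c' 'd', size() reports 5
while the C-string view only sees the first two bytes, so any
raw-data column would have to carry the length along with the pointer.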

   Matthias

[1] 
http://herbsutter.wordpress.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/#comment-483
-- 
Matthias Vallentin
[email protected]
http://www.icir.org/matthias
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
