Hi Sage,
Thank you for the reply.

On Thu, Oct 1, 2015, at 05:33 AM, Sage Weil wrote:
> > - Each message is an object with some unique ID. Use omap to store all
> > its features in the same object.
> > - For each time period (which will have to be pre-specified to, say, an
> > hour), we have an object which contains a list of ID's, as a bytestring
> > of concatenated ID's. This should make expiring old messages trivial.
> 
> This seems reasonable.  There's a rados append operation so you can fire 
> off 2 IOs to write the message and do the append.  You may want to batch 
> the appends on the inject process to reduce load... or it might not 
> matter, depends on the data rate.
> 
> You could also use omap for this if you want to query by time range
> (within the per-day or per-hour object).
> 

What do you mean by this exactly? Is there a way to use timestamps as keys
and query them by range? That would be very useful, but I don't see
anything like that in the librados API. (I see rados_read_op_omap_cmp,
which seems to be for comparing values, not keys?)
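
To make the question concrete, here is the behaviour I am hoping for,
sketched in plain Python with no Ceph calls at all. It assumes that omap
keys are iterated in sorted byte order and that a listing can start after
a given key; both are assumptions on my part. OmapModel, list_after,
ts_key and range_query are hypothetical names for illustration, not
librados API.

```python
import bisect
from datetime import datetime


class OmapModel:
    """Toy in-memory stand-in for one object's omap (hypothetical, not librados)."""

    def __init__(self):
        self._keys = []   # kept sorted, mimicking ordered key iteration
        self._vals = {}

    def set(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def list_after(self, start_after, max_return):
        """Return up to max_return (key, value) pairs with key > start_after."""
        i = bisect.bisect_right(self._keys, start_after)
        return [(k, self._vals[k]) for k in self._keys[i:i + max_return]]


def ts_key(dt, msg_id):
    # Fixed-width timestamp so string order equals time order; the message
    # ID suffix keeps keys unique when two messages share a second.
    return dt.strftime("%Y%m%dT%H%M%S") + "-" + msg_id


def range_query(omap, start, end, page=2):
    """Collect values with timestamp in [start, end), paging through the keys."""
    lo = start.strftime("%Y%m%dT%H%M%S")
    hi = end.strftime("%Y%m%dT%H%M%S")
    out, cursor = [], lo
    while True:
        batch = omap.list_after(cursor, page)
        if not batch:
            return out
        for key, value in batch:
            if key >= hi:          # first key past the range ends the scan
                return out
            out.append(value)
        cursor = batch[-1][0]      # resume after the last key seen


omap = OmapModel()
for (h, m, s), msg_id in [((9, 59, 59), "a"), ((10, 0, 0), "b"),
                          ((10, 30, 0), "c"), ((10, 59, 59), "d"),
                          ((11, 0, 0), "e")]:
    omap.set(ts_key(datetime(2015, 9, 30, h, m, s), msg_id), "msg-" + msg_id)

hits = range_query(omap, datetime(2015, 9, 30, 10), datetime(2015, 9, 30, 11))
print(hits)
```

If something equivalent to list_after exists for omap, then a per-hour
object keyed this way would answer "messages between 10:00 and 11:00"
with a short scan instead of reading the whole ID list.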

> > - For each feature, we have a timestamped index (like
> > 20150930-from-...@bar.com or
> > 20150813-has-attachment-with-hash-123abddeadbeef) which contains a
> > list of ID's. 
> > - Hopefully use Rados classes to index/feature-extract on the OSD's. 
> >
> > How does this sound? One glaring omission is that I do not know how to
> > create indices which would support querying by inequality/ranges ('find
> > all messages between 1000 and 2000 bytes').
> 
> This I'm less sure about.  You could use a rados class to do the feature 
> extraction and store in omap in the same object, but rados doesn't 
> give you a cross-object index.  If you are going to do any queries I 
> would put it in a database of some sort (maybe something like 
> cassandra or hbase?).

That seems like a very good idea: using either of those would give us
the ability to connect Spark or a similar tool. The only disadvantage is
that there would be two components that need to be kept synchronized,
though that in itself sounds like an interesting research topic. I will
report back when I've tried it.

Thank you,
Tom