On May 18, 2011, at 9:32 AM, Sean Owen wrote:

> On Wed, May 18, 2011 at 12:58 PM, Grant Ingersoll <[email protected]> wrote:
>
>> Actually, I think it is core at this point, since we moved the
>> Vectorization stuff to core. Unfortunately, we need Lucene core in order to
>> get the baseline definitions of TokenStreams for Vectorization.
>
> Scratch that, yeah it's reasonably embedded.
>
>> Agreed. I'm open to the move, just wanted to hear more about what it would
>> look like. I think it makes sense that all DataModel implementations move
>> along with it, other than the base classes, as a MySQLJDBCDataModel is in
>> the same class of things as a Mongo one (or Cassandra or HBase, etc.)
>
> I don't think most implementations should move, no. Most are "core" to the
> project. Ones based on backing stores used by some but not many users are
> not core.

One person's core is another's non-core. ;-) I don't personally use JDBC much any more, so it isn't a "core" thing to me, and I would be much more likely to use something like HBase/Cassandra/Mongo (perhaps even Solr/Lucene), etc. (The big difference, I suppose, is that Java has first order JDBC support, whereas it doesn't have first order NoSQL support.)

> I'd like to reiterate I don't think this is specific to collaborative
> filtering at all. Right?

Yes.

>> Likewise, we could do the same with our classification storage
>> implementations. (In fact, I wonder if there is duplication going on between
>> the Taste DataModel and the Classifier ones, at least a very low level)
>
> I strongly doubt it, the DataModel implementations are pretty specific to
> recommenders for performance. Generalizing them breaks that quickly.

I was mostly wondering whether there are lower-level commonalities in the implementations, i.e. the pieces that actually talk to the storage, that could be shared.

Ultimately, let's go for it, and we can continue to refactor over time as the idea evolves.

-Grant
