Synchronizing data is a very broad question, mainly because data is organized very differently in an RDBMS than in ES. You surely know that. See also the object-relational impedance mismatch: http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch
The JDBC river plugin lets you define SQL statements so you can easily construct JSON out of the results, for indexing into ES.

If you can map identifiers from your RDBMS to JSON doc IDs and assign the _id field in the JDBC river plugin, you are lucky. In that case you can simply overwrite existing docs in ES to keep up with the most recent version.

Synchronization also has to cover modifications and deletions, to avoid stale docs, as well as transactional ACID properties. I have no general solution for this. The best approach I know of is to use time-windowed indices and drop indices that are too old, similar to what Logstash does.

Jörg

On Thu, Sep 11, 2014 at 3:39 PM, James <[email protected]> wrote:

> Hi Jorg,
>
> Thank you for the reply. Yes, I meant the elasticsearch river. Simply put,
> I want to synchronize the entries in my SQL database with my elasticsearch,
> so I can use elasticsearch for searching rather than doing fulltext search. I
> want to know that when an item gets added to or removed from that database,
> it also gets added to / removed from elasticsearch.
>
> My understanding, which might be wrong, is that I can either use the PHP
> elasticsearch library to push updates (adds / removes) to elasticsearch
> when new items are added to SQL:
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html
>
> Or I can use the JDBC river plugin for elasticsearch to connect to
> my database directly and synchronize elasticsearch with the SQL database.
>
> My two questions are:
>
> 1. Is my understanding above correct?
> 2. Does one option have advantages over the other?
>
> - James
>
> On Wednesday, September 10, 2014 10:59:18 AM UTC+1, James wrote:
>>
>> Hi,
>>
>> I'm setting up a system where I have a main SQL database which is synced
>> with elasticsearch. My plan is to use the main PHP library for
>> elasticsearch.
>>
>> I was going to have a cron run every thirty minutes to check for items in
>> my database that not only have an "active" flag but also do not have
>> an "indexed" flag, which means I need to add them to the index. Then I was
>> going to add each such item to the index. Since I am taking this path, it
>> doesn't seem like I need the JDBC driver, as I can add items to
>> elasticsearch using the PHP library.
>>
>> So, my question is, can I get away without using the JDBC driver?
>>
>> James
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1d5fe901-fd0e-4663-9c68-5f7cf8092cf1%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
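
For the cron-based approach described in the quoted message, combined with the idea of reusing the RDBMS identifier as the doc _id, a minimal sketch could look like the following. It is in Python rather than PHP and uses only the standard library. Everything named here is an assumption for illustration: the `items` table, its `id`, `title`, `body`, `active` and `indexed` columns, and the helper names `build_bulk_body` and `mark_synced`. It only builds a request body in the newline-delimited format of the ES bulk API; sending it to `/_bulk` is left out. It also assumes removals are soft deletes (the `active` flag is cleared), because a row that is physically deleted from SQL cannot be found by polling.

```python
import json
import sqlite3


def build_bulk_body(conn, index_name):
    """Build a bulk-API body for rows that are out of sync.

    Returns (ndjson_body, ids_to_index, ids_to_delete).
    Table and column names are hypothetical.
    """
    lines = []
    to_index, to_delete = [], []

    # Active rows that are not indexed yet. The primary key is reused as _id,
    # so a later run overwrites the same doc instead of creating a duplicate.
    # (Older ES versions also expect a "_type" in the action line.)
    for row_id, title, body in conn.execute(
        "SELECT id, title, body FROM items "
        "WHERE active = 1 AND indexed = 0 ORDER BY id"
    ):
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(row_id)}}))
        lines.append(json.dumps({"title": title, "body": body}))
        to_index.append(row_id)

    # Deactivated rows that are still indexed: delete them by the same _id
    # to avoid stale docs.
    for (row_id,) in conn.execute(
        "SELECT id FROM items WHERE active = 0 AND indexed = 1 ORDER BY id"
    ):
        lines.append(json.dumps({"delete": {"_index": index_name, "_id": str(row_id)}}))
        to_delete.append(row_id)

    # The bulk API expects one JSON object per line and a trailing newline.
    body = "\n".join(lines) + "\n" if lines else ""
    return body, to_index, to_delete


def mark_synced(conn, to_index, to_delete):
    """Update the "indexed" flag once ES has accepted the bulk request."""
    conn.executemany("UPDATE items SET indexed = 1 WHERE id = ?", [(i,) for i in to_index])
    conn.executemany("UPDATE items SET indexed = 0 WHERE id = ?", [(i,) for i in to_delete])
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, body TEXT, "
        "active INTEGER, indexed INTEGER)"
    )
    conn.execute("INSERT INTO items VALUES (1, 'first', 'hello', 1, 0)")
    body, to_index, to_delete = build_bulk_body(conn, "items")
    print(body, end="")
    mark_synced(conn, to_index, to_delete)
```

Calling `mark_synced` only after ES has confirmed the request means a failed run is simply retried by the next cron run, and because the _id is stable, indexing the same row twice is harmless. This still does not give you transactional guarantees across the two systems.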
