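You should rethink your SQL statement, because JOIN does not work like this.
Without join conditions the result is a cross product: every row of one table
is combined with every row of the others, so the row counts multiply.

You have to define join conditions so that only matching rows of the tables
are combined.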

For example:

SELECT movies.*, persons.name, tags.value
FROM movies
JOIN persons ON persons.id = movies....
JOIN tags ON tags.id = movies....
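
To make the difference concrete, here is a minimal sketch that counts the rows
both ways. It is only an illustration under assumptions: it uses Python's
built-in sqlite3 as a stand-in for MySQL, a tiny made-up data set, and
movie_id columns in persons and tags (as in the split queries quoted below),
which may not match the real schema.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL, and persons/tags carry a
# movie_id column pointing at movies.id. The data is invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE persons (id INTEGER PRIMARY KEY, movie_id INTEGER, name TEXT);
    CREATE TABLE tags    (id INTEGER PRIMARY KEY, movie_id INTEGER, value TEXT);
""")

# 3 movies, each with 2 persons and 4 tags.
for movie_id in (1, 2, 3):
    conn.execute("INSERT INTO movies (id, title) VALUES (?, ?)",
                 (movie_id, f"movie {movie_id}"))
    for p in range(2):
        conn.execute("INSERT INTO persons (movie_id, name) VALUES (?, ?)",
                     (movie_id, f"person {movie_id}-{p}"))
    for t in range(4):
        conn.execute("INSERT INTO tags (movie_id, value) VALUES (?, ?)",
                     (movie_id, f"tag {movie_id}-{t}"))


def count_rows(sql):
    """Return how many rows the given SELECT statement produces."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


# No join conditions: every movie is paired with every person and every tag.
unconditioned = count_rows(
    "SELECT m.id FROM movies m JOIN persons p JOIN tags t"
)

# With join conditions: only persons and tags of the same movie are combined.
conditioned = count_rows(
    "SELECT m.id FROM movies m "
    "JOIN persons p ON p.movie_id = m.id "
    "JOIN tags t ON t.movie_id = m.id"
)

print("without conditions:", unconditioned)  # 3 * 6 * 12 = 216
print("with conditions:   ", conditioned)    # 3 * (2 * 4) = 24
```

Note that even with conditions each movie still comes back once per
person/tag combination (2 * 4 = 8 rows per movie here), so the conditions
shrink the result but do not remove the repetition of the movie columns.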

Jörg

On Fri, Jan 16, 2015 at 11:34 AM, Stalinko <[email protected]> wrote:

> I'm trying to index my movie DB into ES using MySQL JDBC river.
>
> The problem is:
> there are 3 tables:
> movies - has many columns
> persons - names of the people who participated in some movie
> tags - movie tags
>
> I'm indexing it with a query like this (not the exact query, just
> pseudo-code to explain the problem):
>
> SELECT movies.*, persons.name, tags.value
> FROM movies m
> JOIN persons
> JOIN tags
>
> There are quite a lot of movies, each with many columns, and each movie has
> something like 10-30 persons and 1000 tags as well.
> Because of the joins, all the movie data is therefore duplicated
> 10,000-30,000 times in the result set.
> That causes a heavy load: one indexing run takes more than an hour, but I
> need to re-index the data every day.
>
> Is there a way to index arrays without duplicating all the data?
> I tried splitting the query into three:
>
> SELECT id as _id, * FROM movies
> SELECT movie_id as _id, name FROM persons
> SELECT movie_id as _id, value FROM tags
>
> But these queries overwrite each other instead of updating.
>
> Can anybody help me? Looks like the plugin wasn't designed for such cases
> and I need to write my own strategy (however I don't write in Java :()
>
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1421404458605-4069168.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>
