Re: JDBC river query results collapsing to JSON issue

[email protected] Tue, 22 Apr 2014 09:52:30 -0700

From what I understand, you want a single ES document that holds the
name:address relation as a 1:N relation, where the only ID available is the
one for the name (in the example, 0000003934 for Kelly A. Draper).


It would also help to define an identifier for each address, so that you
could index the addresses in one index and the person names in another,
using two rivers.
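
To illustrate why per-address identifiers help, here is a minimal sketch
using Python's sqlite3 module and a cut-down copy of the example data. The
table layout is simplified, and the address_id column is a hypothetical
identifier that does not appear in the original schema. Joining names and
addresses in one query multiplies the rows (2 names x 5 addresses), while
two separate queries, one per river, return each name and each address once.

```python
import sqlite3

PERSON_ID = "0000003934"


def build_demo_db():
    """Create an in-memory database: one person, two names, five addresses."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entity (id_number TEXT PRIMARY KEY, pref_mail_name TEXT);
        CREATE TABLE name (id_number TEXT, middle_name TEXT, last_name TEXT);
        -- address_id is a hypothetical per-address identifier (assumption).
        CREATE TABLE address (
            address_id INTEGER PRIMARY KEY,
            id_number TEXT,
            street1 TEXT,
            city TEXT
        );
        """
    )
    conn.execute("INSERT INTO entity VALUES (?, ?)", (PERSON_ID, "Kelly A. Draper"))
    conn.executemany(
        "INSERT INTO name VALUES (?, ?, ?)",
        [(PERSON_ID, "Ann", "Draper"), (PERSON_ID, "A.", "McElroy")],
    )
    conn.executemany(
        "INSERT INTO address (id_number, street1, city) VALUES (?, ?, ?)",
        [
            (PERSON_ID, "13679 Stoney Springs Dr", "Chardon"),
            (PERSON_ID, "1400 McDonald Investment Ctr", "Cleveland"),
            (PERSON_ID, "13156 Aldenshire Dr", "Chardon"),
            (PERSON_ID, "100 7th Ave Ste 150", "Chardon"),
            (PERSON_ID, "13765 Equestrian Dr", "Burton"),
        ],
    )
    return conn


def run_queries(conn):
    """Return rows for the single joined query and for the two split queries."""
    # One query over both 1:N tables: every name is paired with every address.
    joined = conn.execute(
        "SELECT e.id_number, n.last_name, a.street1 FROM entity e "
        "LEFT JOIN name n ON e.id_number = n.id_number "
        "LEFT JOIN address a ON e.id_number = a.id_number"
    ).fetchall()
    # Split query 1: person names only, keyed by the person identifier.
    persons = conn.execute(
        "SELECT e.id_number AS _id, e.pref_mail_name, n.middle_name, n.last_name "
        "FROM entity e LEFT JOIN name n ON e.id_number = n.id_number"
    ).fetchall()
    # Split query 2: addresses only, keyed by the hypothetical address_id.
    addresses = conn.execute(
        "SELECT a.address_id AS _id, a.id_number AS person_id, a.street1, a.city "
        "FROM address a"
    ).fetchall()
    return joined, persons, addresses


if __name__ == "__main__":
    joined, persons, addresses = run_queries(build_demo_db())
    print(len(joined), len(persons), len(addresses))
```

Each address row then carries its own _id plus a reference back to the
person, so the two result sets can be indexed independently.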

Support for nested objects through the bracket notation in SQL pseudo-column
names is somewhat limited in the JDBC river. If anyone feels like improving
this, patches and pull requests would be very welcome!

At the moment, I think that without identifiers or a given enumeration
scheme, it is impossible to identify the sequence of JSON objects in a
nested document that can be collapsed or grouped.
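
As a rough illustration of what such grouping needs, here is a minimal
Python sketch that collapses rows sharing an _id into one document with
nested "names" and "addresses" lists. It is not the river's implementation.
The function name collapse and the two field groupings are assumptions made
for this example, and the complete tuple of fields stands in for the missing
identifier of each nested object.

```python
NAME_FIELDS = ("first_name", "middle_name", "last_name")
ADDRESS_FIELDS = ("street1", "street2", "street3", "city", "state_code", "zipcode")


def collapse(rows):
    """Group rows by _id and keep each distinct name and address once.

    rows is an iterable of dicts, one per SQL result row. The grouping of
    columns into NAME_FIELDS and ADDRESS_FIELDS is supplied by hand here;
    a flat result row does not carry that information by itself.
    """
    docs = {}
    for row in rows:
        doc = docs.setdefault(
            row["_id"], {"_id": row["_id"], "names": [], "addresses": []}
        )
        name = {field: row[field] for field in NAME_FIELDS}
        address = {field: row[field] for field in ADDRESS_FIELDS}
        # The whole object acts as its own identity: identical field values
        # are treated as the same name or the same address.
        if name not in doc["names"]:
            doc["names"].append(name)
        if address not in doc["addresses"]:
            doc["addresses"].append(address)
    return list(docs.values())
```

Without the hand-written field groupings, nothing in a flat row says which
columns belong to the same nested object, which is where the identifiers
or an enumeration scheme would come in.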

Jörg




On Tue, Apr 22, 2014 at 4:35 PM, jrizzi1 <[email protected]> wrote:

> I am having an issue with the jdbc river collapsing during the bulk insert
>
> i have records that have some single value properties, and can have
> multiple value properties (names, addresses and emails)
>
>
> there are a total of around 4.5 million rows that collapse down to 600k
>
> if the river sql criteria is set to be where id="001", it works fine
>
> but during the bulk process ie all of my rows, only one property that can
> have multiple values is correct, other properties are missing data
>
>
> here is an example of the query output that the river collapses to JSON
> it has 2 middle names, 2 last names, and 4 addresses
>
> _id | pref_mail_name | pref_class_year | record_status_code | first_name | middle_name | last_name | street1 | street2 | street3 | city | state_code | zipcode | email_address
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 13679 Stoney Springs Dr | | | Chardon | OH | 44024-8918 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 1400 McDonald Investment Ctr | 800 Superior Ave E Ste 1400 | | Cleveland | OH | 44114-2617 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 13156 Aldenshire Dr | | | Chardon | OH | 44024-8921 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 100 7th Ave Ste 150 | | | Chardon | OH | 44024-7808 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 13765 Equestrian Dr | | | Burton | OH | 44021-9552 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 13679 Stoney Springs Dr | | | Chardon | OH | 44024-8918 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 1400 McDonald Investment Ctr | 800 Superior Ave E Ste 1400 | | Cleveland | OH | 44114-2617 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 13156 Aldenshire Dr | | | Chardon | OH | 44024-8921 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 100 7th Ave Ste 150 | | | Chardon | OH | 44024-7808 | [email protected]
> 0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 13765 Equestrian Dr | | | Burton | OH | 44021-9552 | [email protected]
>
> after a river run, the indexed doc has 4 addresses, but only one middle
> name and one last name, the other never was indexed
>
>    "_source": {
>       "pref_mail_name": "Kelly A. Draper",
>       "street2": [
>          " ",
>          "800 Superior Ave E Ste 1400"
>       ],
>       "street1": [
>          "13679 Stoney Springs Dr",
>          "1400 McDonald Investment Ctr",
>          "13156 Aldenshire Dr",
>          "100 7th Ave Ste 150",
>          "13765 Equestrian Dr"
>       ],
>       "state_code": "OH",
>       "middle_name": "A.",
>       "zipcode": [
>          "44024-8918",
>          "44114-2617",
>          "44024-8921",
>          "44024-7808",
>          "44021-9552"
>       ],
>       "pref_class_year": "1999",
>       "record_status_code": "A",
>       "city": [
>          "Chardon",
>          "Cleveland",
>          "Burton"
>       ],
>       "first_name": "Kelly",
>       "last_name": "McElroy",
>       "street3": " ",
>       "email_address": "[email protected]"
>    }
> }
>
>
> I have attempted using bracket notation for creating objects, but the same
> issue exists, only now the properties are nested
>
>
> my river looks like this
>
> PUT /_river/matcher/_meta
> {
>     "type" : "jdbc",
>     "jdbc" : {
>         "url" : "serverurl",
>         "user" : "USER",
>         "password" : "#########",
>         "sql" : "select e.id_number as \"_id\", e.pref_mail_name as
> \"pref_mail_name\", e.pref_class_year as \"pref_class_year\",
> e.record_status_code as \"record_status_code\", a.street1 as \"street1\",
> a.street2 as \"street2\", a.street3 as \"street3\", a.city as \"city\",
> a.state_code as \"state_code\", a.zipcode as \"zipcode\", n.first_name  as
> \"first_name\", n.middle_name as \"middle_name\", n.last_name as
> \"last_name\", email.email_address as \"email_address\" from entity e  left
> join name n on e.id_number = n.id_number left join email on e.id_number =
> email.id_number left join address a on e.id_number = a.id_number where
> e.person_or_org = 'P' and e.record_status_code IN ('A', 'L', 'D') ",
>         "index" : "matcher",
>         "type" : "entity",
>         "bulk_size" : 160,
>         "max_bulk_requests" : 5
>     }
> }
>
> let me know if i can provide additional info
>
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/JDBC-river-query-results-collapsing-to-JSON-issue-tp4054562.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1398177305643-4054562.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>
