I am having an issue with the JDBC river when it collapses rows during a bulk insert.

I have records that have some single-value properties and can also have
multi-value properties (names, addresses and emails).


There are around 4.5 million rows in total, which collapse down to about 600k records.

If the river SQL criteria is restricted with where id="001", it works fine.

But during the bulk process, i.e. with all of my rows, only one of the properties
that can have multiple values comes out correct; the other properties are missing data.


Here is an example of the query output that the river collapses to JSON.
It has 2 middle names, 2 last names, and 5 addresses (fields are separated by " | ",
blank fields are left empty):

_id | pref_mail_name | pref_class_year | record_status_code | first_name | middle_name | last_name | street1 | street2 | street3 | city | state_code | zipcode | email_address
0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 13679 Stoney Springs Dr | | | Chardon | OH | 44024-8918 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 1400 McDonald Investment Ctr | 800 Superior Ave E Ste 1400 | | Cleveland | OH | 44114-2617 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 13156 Aldenshire Dr | | | Chardon | OH | 44024-8921 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 100 7th Ave Ste 150 | | | Chardon | OH | 44024-7808 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | Ann | Draper | 13765 Equestrian Dr | | | Burton | OH | 44021-9552 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 13679 Stoney Springs Dr | | | Chardon | OH | 44024-8918 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 1400 McDonald Investment Ctr | 800 Superior Ave E Ste 1400 | | Cleveland | OH | 44114-2617 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 13156 Aldenshire Dr | | | Chardon | OH | 44024-8921 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 100 7th Ave Ste 150 | | | Chardon | OH | 44024-7808 | [email protected]
0000003934 | Kelly A. Draper | 1999 | A | Kelly | A. | McElroy | 13765 Equestrian Dr | | | Burton | OH | 44021-9552 | [email protected]

After a river run, the indexed doc has all 5 addresses, but only one middle name
and one last name; the other middle name and last name were never indexed:

   "_source": {
      "pref_mail_name": "Kelly A. Draper",
      "street2": [
         " ",
         "800 Superior Ave E Ste 1400"
      ],
      "street1": [
         "13679 Stoney Springs Dr",
         "1400 McDonald Investment Ctr",
         "13156 Aldenshire Dr",
         "100 7th Ave Ste 150",
         "13765 Equestrian Dr"
      ],
      "state_code": "OH",
      "middle_name": "A.",
      "zipcode": [
         "44024-8918",
         "44114-2617",
         "44024-8921",
         "44024-7808",
         "44021-9552"
      ],
      "pref_class_year": "1999",
      "record_status_code": "A",
      "city": [
         "Chardon",
         "Cleveland",
         "Burton"
      ],
      "first_name": "Kelly",
      "last_name": "McElroy",
      "street3": " ",
      "email_address": "[email protected]"
   }
}
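
For comparison, here is a minimal Python sketch of the collapse I expect for rows that share an _id. It is only an illustration of the expected result, not the river's actual code: the collapse function is hypothetical, and the sample rows are a shortened subset of the columns and rows above.

```python
from collections import OrderedDict


def collapse(rows):
    """Group rows by _id and keep the distinct values of every column."""
    docs = OrderedDict()
    for row in rows:
        doc = docs.setdefault(row["_id"], OrderedDict())
        for column, value in row.items():
            if column == "_id":
                continue
            values = doc.setdefault(column, [])
            if value not in values:  # keep first-seen order, skip repeats
                values.append(value)
    # Columns with one distinct value become scalars, the rest stay lists.
    return {
        _id: {col: vals[0] if len(vals) == 1 else vals for col, vals in doc.items()}
        for _id, doc in docs.items()
    }


# Shortened sample: two name variants crossed with two addresses.
rows = [
    {"_id": "0000003934", "first_name": "Kelly", "middle_name": "Ann",
     "last_name": "Draper", "street1": "13679 Stoney Springs Dr"},
    {"_id": "0000003934", "first_name": "Kelly", "middle_name": "Ann",
     "last_name": "Draper", "street1": "13156 Aldenshire Dr"},
    {"_id": "0000003934", "first_name": "Kelly", "middle_name": "A.",
     "last_name": "McElroy", "street1": "13679 Stoney Springs Dr"},
    {"_id": "0000003934", "first_name": "Kelly", "middle_name": "A.",
     "last_name": "McElroy", "street1": "13156 Aldenshire Dr"},
]

expected = collapse(rows)
print(expected["0000003934"])
```

In this sketch middle_name and last_name both come out as two-element lists, which is what I expected to see in the indexed doc.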


I have also tried using bracket notation to create nested objects, but the same
issue occurs, only now the properties are nested.


My river looks like this:

PUT /_river/matcher/_meta
{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "serverurl",
        "user" : "USER",
        "password" : "#########",
        "sql" : "select e.id_number as \"_id\", e.pref_mail_name as \"pref_mail_name\", e.pref_class_year as \"pref_class_year\", e.record_status_code as \"record_status_code\", a.street1 as \"street1\", a.street2 as \"street2\", a.street3 as \"street3\", a.city as \"city\", a.state_code as \"state_code\", a.zipcode as \"zipcode\", n.first_name as \"first_name\", n.middle_name as \"middle_name\", n.last_name as \"last_name\", email.email_address as \"email_address\" from entity e left join name n on e.id_number = n.id_number left join email on e.id_number = email.id_number left join address a on e.id_number = a.id_number where e.person_or_org = 'P' and e.record_status_code IN ('A', 'L', 'D')",
        "index" : "matcher",
        "type" : "entity",
        "bulk_size" : 160,
        "max_bulk_requests" : 5        
    }
}
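
One thing I noticed is that the indexed doc is exactly what you would get from collapsing only the McElroy rows. The Python sketch below shows one way that could happen. It is an assumption on my part, not something I know the river does: if rows for the same _id were collapsed in separate groups and each group were indexed under the same _id, the later document would replace the earlier one. The function names and the batch_size parameter are hypothetical and are not the river's bulk_size setting.

```python
def collapse_group(rows):
    """Collapse rows of one _id into a doc with distinct values per column."""
    doc = {}
    for row in rows:
        for column, value in row.items():
            if column == "_id":
                continue
            values = doc.setdefault(column, [])
            if value not in values:
                values.append(value)
    return {col: vals[0] if len(vals) == 1 else vals for col, vals in doc.items()}


def index_in_batches(rows, batch_size):
    """Collapse each batch on its own; indexing an _id again replaces the doc."""
    index = {}
    for start in range(0, len(rows), batch_size):
        by_id = {}
        for row in rows[start:start + batch_size]:
            by_id.setdefault(row["_id"], []).append(row)
        for _id, group in by_id.items():
            index[_id] = collapse_group(group)  # overwrites any earlier doc
    return index


# Shortened sample: two name variants crossed with two addresses.
rows = [
    {"_id": "0000003934", "middle_name": "Ann", "last_name": "Draper",
     "street1": "13679 Stoney Springs Dr"},
    {"_id": "0000003934", "middle_name": "Ann", "last_name": "Draper",
     "street1": "13156 Aldenshire Dr"},
    {"_id": "0000003934", "middle_name": "A.", "last_name": "McElroy",
     "street1": "13679 Stoney Springs Dr"},
    {"_id": "0000003934", "middle_name": "A.", "last_name": "McElroy",
     "street1": "13156 Aldenshire Dr"},
]

split = index_in_batches(rows, batch_size=2)   # rows for one _id end up in two groups
whole = index_in_batches(rows, batch_size=4)   # all rows for the _id stay together
print(split["0000003934"])
print(whole["0000003934"])
```

With the rows split into two groups, the sketch keeps both addresses but only "A." and "McElroy", which matches the symptom above; with all rows in one group, both names survive.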

Let me know if I can provide additional info.



