Hi,
I'm trying to index documents in Elasticsearch. I'm using Elasticsearch
1.2.1 through the Java API.
My cluster is remote: 3 nodes on 3 servers (one node per server),
optimised for indexing (one shard per node, no replication).
To do this, I read a CSV file, from which I generate a mapping.
For performance reasons, I try to disable _source. That works: the mapping
I read back after index creation is correct.
The problem is that after inserting data, I have nothing in my docs except
the id generated by ES. If I enable the _source field, my data appears only
in the _source field.
Here is how I generate the mapping:
    XContentBuilder mapping = jsonBuilder()
        .startObject()
            .startObject("record")
                //.startObject("_source").field("enabled", false).endObject()
                .startObject("properties");
    // One stored, analyzed field per CSV column.
    for (ColumnMetadata column : dataset.getMetadata().getColumns()) {
        mapping.startObject(column.getName())
            .field("type", ESColumnTypeHelper.getESType(column.getType()))
            .field("store", "yes")
            .field("index", "analyzed")
            .endObject();
    }
    mapping.endObject()   // closes "properties"
        .endObject()      // closes "record"
        .endObject();     // closes the root object
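To make the intended result easier to compare with what the server returns, here is a minimal sketch of the JSON that the builder above should produce once _source is disabled. It uses plain string building instead of XContentBuilder so it runs without the Elasticsearch jar. The class name `MappingSketch`, the method `buildMapping`, and the two sample columns are hypothetical, and it assumes field names contain no characters that need JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MappingSketch {

    // Builds the mapping JSON for one type from (field name -> ES type) pairs.
    // Assumption: names and types need no JSON escaping.
    static String buildMapping(String typeName, Map<String, String> columns, boolean sourceEnabled) {
        StringBuilder json = new StringBuilder();
        json.append("{\"").append(typeName).append("\":{");
        json.append("\"_source\":{\"enabled\":").append(sourceEnabled).append("},");
        json.append("\"properties\":{");
        boolean first = true;
        for (Map.Entry<String, String> column : columns.entrySet()) {
            if (!first) {
                json.append(",");
            }
            first = false;
            json.append("\"").append(column.getKey()).append("\":{")
                .append("\"type\":\"").append(column.getValue()).append("\",")
                .append("\"store\":true,")
                .append("\"index\":\"analyzed\"}");
        }
        json.append("}}}"); // closes properties, the type, and the root object
        return json.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample columns, in insertion order.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("name", "string");
        columns.put("city_id", "string");
        System.out.println(buildMapping("record", columns, false));
    }
}
```

Printing the string and comparing it with the output of the _mapping endpoint is one way to check that every field really carries "store" : true.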
Then I create the index:
    Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", storeClusterName).build();
    CreateIndexRequestBuilder createIndexRequestBuilder =
        client.admin().indices().prepareCreate(datasetName).addMapping("record", mapping);
    CreateIndexRequest request = createIndexRequestBuilder.request();
    try {
        CreateIndexResponse createResponse = client.admin().indices().create(request).actionGet();
        if (!createResponse.isAcknowledged()) {
            logger.log(Level.SEVERE, "Index creation not acknowledged.");
        } else {
            logger.log(Level.INFO, "Index creation acknowledged.");
        }
    } catch (IndexAlreadyExistsException iae) {
        logger.log(Level.SEVERE, "Index already exists...");
    }
And now here is how I index using the Bulk API:
    BulkRequestBuilder bulkRequest = client.prepareBulk();
    try {
        logger.log(Level.INFO, "Creating records");
        for (Record record : records) {
            IndexRequestBuilder builder = client.prepareIndex(datasetName, "record");
            XContentBuilder data = jsonBuilder();
            data.startObject();
            for (ColumnMetadata column : dataset.getMetadata().getColumns()) {
                Object value = record.getCell(column.getName()).getValue();
                // Treat the literal string "NULL" as a missing value.
                if (value == null || (value instanceof String && value.equals("NULL"))) {
                    value = null;
                }
                data.field(column.getNormalizedName(), value);
            }
            data.endObject();
            builder.setSource(data);
            bulkRequest.add(builder);
            logger.log(Level.INFO, "Creating records");
        }
        logger.log(Level.INFO, "Created " + bulkRequest.numberOfActions() + " records");
        BulkResponse bulkResponse = bulkRequest.execute().actionGet();
        if (bulkResponse.hasFailures()) {
            logger.log(Level.SEVERE, "Could not index : " + bulkResponse.buildFailureMessage());
        }
    } catch (IOException e) {
        // jsonBuilder() and the XContentBuilder calls declare IOException.
        logger.log(Level.SEVERE, "Could not build record JSON", e);
    }
Now for the resulting data. First, the mapping I read back using cURL (in
this one, I allow _default):
    curl -XGET 'http://myserver:9200/realestateagencies/_mapping/record'
    {
      "realestateagencies" : {
        "mappings" : {
          "record" : {
            "properties" : {
              "agencystatus" : { "store" : true, "type" : "string" },
              "cardnumber" : { "store" : true, "type" : "string" },
              "city_id" : { "store" : true, "type" : "string" },
              "companyname" : { "store" : true, "type" : "string" },
              [REMOVED SOME MAPPING TO SHORTEN DISPLAY]
              "streetlabel" : { "store" : true, "type" : "string" },
              "streetnumber" : { "store" : true, "type" : "string" },
              "streetnumbercomplement" : { "store" : true, "type" : "string" },
              "streettype" : { "store" : true, "type" : "string" },
              "summarizedagency_id" : { "store" : true, "type" : "string" },
              "updatedate" : { "store" : true, "type" : "string" },
              "websiteurl" : { "store" : true, "type" : "string" }
            }
          }
        }
      }
    }
I can't see any "index" setting in the returned mapping, even though I
explicitly gave it a value ( .field("index", "analyzed") ), but I suppose
that's because index : analyzed is the default value.
After that, a query on my index type returns this:
    {
      took: 7
      timed_out: false
      _shards: {
        total: 3
        successful: 3
        failed: 0
      }
      hits: {
        total: 100000
        max_score: 1
        hits: [
          {
            _index: realestateagencies
            _type: record
            _id: c2yWW2S3TyKkJGGFpgVS4g
            _score: 1
            _source: {
              id: 83163
              crawlsource: 1
              deletedate: null
              updatedate: null
              dealerkind: 1
              email: null
              name: Agence Principale - Colombes
              phonenumber: 0142423333
              latitude: 0
              longitude: 0
              normalized: 6RUEGABRIELPERI92700COLOMBES
              original: - 6 RUE GABRIEL PERI 92700 COLOMBES
              street1: 6 rue Gabriel Peri
              street2: null
              streetlabel: Gabriel Peri
              streetnumber: 6
              streetnumbercomplement: 0
              streettype: 20
              cardnumber: null
              companyname: null
              createdate: 1969-12-31T23:00:00.000Z
              faxnumber: null
              logourl: null
              normalizedname: AGENCEPRINCIPALECOLOMBES
              rcsnumber: null
              sirennumber: 450499298
              siretnumber: null
              websiteurl: null
              agencystatus: 1
              keyperportal: 484737|AGENCEPRINCIPALECOLOMBES
              reconciliationpolicy: 1
              city_id: 17988
              summarizedagency_id: 408837
            }
          },
          [REST IS REMOVED]
And of course, if I disable the _source field, the resulting docs are empty.
I cannot see what I'm doing wrong here. Can anyone see something wrong?
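One detail that may be relevant to the empty hits: as far as I understand, Elasticsearch 1.x returns only _source by default, and stored fields come back only when the search request lists them under "fields". The sketch below builds such a request body with plain Java so it runs without the Elasticsearch jar. The class name `StoredFieldsQuerySketch`, the method `buildSearchBody`, and the choice of the two fields are hypothetical, and it assumes field names need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;

public class StoredFieldsQuerySketch {

    // Builds a 1.x-style search body that asks for specific stored fields.
    // Assumption: field names need no JSON escaping.
    static String buildSearchBody(List<String> storedFields) {
        StringBuilder json = new StringBuilder("{\"fields\":[");
        for (int i = 0; i < storedFields.size(); i++) {
            if (i > 0) {
                json.append(",");
            }
            json.append("\"").append(storedFields.get(i)).append("\"");
        }
        json.append("],\"query\":{\"match_all\":{}}}");
        return json.toString();
    }

    public static void main(String[] args) {
        // Two field names taken from the mapping shown earlier in this post.
        System.out.println(buildSearchBody(Arrays.asList("streetlabel", "city_id")));
    }
}
```

The resulting body could be posted to http://myserver:9200/realestateagencies/record/_search. If the fields really are stored, each hit should then carry a "fields" object instead of "_source".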