Hi Ivan, thanks for all this. I was getting results, but without my fields. Now that you have explained that queries return only _source by default, it all makes sense. I was actually thinking of disabling _source to speed up indexing, but if that costs me performance at query time, it does not seem like a good idea.
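To make the point about stored fields concrete, here is a minimal sketch of a search request body that names the stored fields to return instead of relying on the default _source. It assumes Elasticsearch 1.x, where the request body key is `fields`. The class and method names (`StoredFieldsQuery`, `buildSearchBody`) are hypothetical, and the field names are borrowed from the mapping quoted further down in this thread.

```java
import java.util.Arrays;
import java.util.List;

public class StoredFieldsQuery {

    // Builds a match_all search body that explicitly lists the stored
    // fields to return. Assumption: field names are plain identifiers,
    // so no JSON escaping is applied to them here.
    static String buildSearchBody(List<String> storedFields) {
        StringBuilder body = new StringBuilder();
        body.append("{\"query\":{\"match_all\":{}},\"fields\":[");
        for (int i = 0; i < storedFields.size(); i++) {
            if (i > 0) {
                body.append(',');
            }
            body.append('"').append(storedFields.get(i)).append('"');
        }
        body.append("]}");
        return body.toString();
    }

    public static void main(String[] args) {
        // Field names taken from the mapping shown in the original post.
        String body = buildSearchBody(
                Arrays.asList("companyname", "city_id", "streetlabel"));
        System.out.println(body);
    }
}
```

The resulting string could be sent as the body of a search request against the index and type from the original post. With the Java client, `SearchRequestBuilder.addFields(...)` should express the same request, but that is worth checking against the 1.2.1 Javadoc.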
Thanks a lot!

On Monday, 23 June 2014 16:05:06 UTC+2, Ivan Brusic wrote:
>
> What exactly is the issue? Are you getting back results, just with no
> data? By default, a query will only return the _source field. If you want
> to return other stored fields, then you need to name them explicitly:
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html#search-request-fields
>
> Also, disabling source does not necessarily increase performance. Lucene
> would need to execute a seek for each individual field instead of just one
> for the source. If you are requesting numerous fields, then using stored
> fields actually decreases performance. It all depends on the size of your
> document/fields and the number of fields used.
>
> Cheers,
>
> Ivan
>
> On Sun, Jun 22, 2014 at 3:00 AM, Frederic Esnault <[email protected]> wrote:
>
>> Hi,
>>
>> I'm trying to index documents in Elasticsearch. I'm using Elasticsearch
>> 1.2.1, from the Java API.
>> My cluster is remote, 3 nodes on 3 servers (one node on each server),
>> optimised for indexing (one shard per node, no replication).
>> For this, I read a CSV file, from which I generate a mapping file.
>> For performance reasons, I try to disable _source, which works: the
>> mapping I can read after index creation is correct.
>> The thing is, after inserting data, I have nothing in my docs except the
>> id generated by ES. If I allow the _source field, I only have my data in
>> the _source field.
>>
>> Here is how I generate the mapping:
>>
>>     XContentBuilder mapping = jsonBuilder()
>>         .startObject()
>>             .startObject("record")
>>                 //.startObject("_source").field("enabled", false).endObject()
>>                 .startObject("properties");
>>     for (ColumnMetadata column : dataset.getMetadata().getColumns()) {
>>         mapping.startObject(column.getName())
>>             .field("type", ESColumnTypeHelper.getESType(column.getType()))
>>             .field("store", "yes")
>>             .field("index", "analyzed")
>>             .endObject();
>>     }
>>     mapping.endObject()
>>         .endObject()
>>         .endObject();
>>
>> Then I create the index:
>>
>>     Settings settings = ImmutableSettings.settingsBuilder()
>>         .put("cluster.name", storeClusterName).build();
>>
>>     CreateIndexRequestBuilder createIndexRequestBuilder =
>>         client.admin().indices().prepareCreate(datasetName).addMapping("record", mapping);
>>     CreateIndexRequest request = createIndexRequestBuilder.request();
>>     try {
>>         CreateIndexResponse createResponse =
>>             client.admin().indices().create(request).actionGet();
>>         if (!createResponse.isAcknowledged()) {
>>             logger.log(Level.SEVERE, "Index creation not acknowledged.");
>>         } else {
>>             logger.log(Level.INFO, "Index creation acknowledged.");
>>         }
>>     } catch (IndexAlreadyExistsException iae) {
>>         logger.log(Level.SEVERE, "Index already exists...");
>>     }
>>
>> And now how I index using the Bulk API:
>>
>>     BulkRequestBuilder bulkRequest = client.prepareBulk();
>>     try {
>>         logger.log(Level.INFO, "Creating records");
>>         for (Record record : records) {
>>             IndexRequestBuilder builder = client.prepareIndex(datasetName, "record");
>>             XContentBuilder data = jsonBuilder();
>>             data.startObject();
>>             for (ColumnMetadata column : dataset.getMetadata().getColumns()) {
>>                 Object value = record.getCell(column.getName()).getValue();
>>                 // Treat the literal string "NULL" from the CSV as a real null.
>>                 if (value == null || (value instanceof String && value.equals("NULL"))) {
>>                     value = null;
>>                 }
>>                 data.field(column.getNormalizedName(), value);
>>             }
>>             data.endObject();
>>             builder.setSource(data);
>>             bulkRequest.add(builder);
>>             logger.log(Level.INFO, "Creating records");
>>         }
>>         logger.log(Level.INFO, "Created " + bulkRequest.numberOfActions() + " records");
>>
>>         BulkResponse bulkResponse = bulkRequest.execute().actionGet();
>>         if (bulkResponse.hasFailures()) {
>>             logger.log(Level.SEVERE, "Could not index : " + bulkResponse.buildFailureMessage());
>>         }
>>     } catch (IOException e) {
>>         // jsonBuilder() and XContentBuilder.field() can throw IOException.
>>         logger.log(Level.SEVERE, "Could not build index request", e);
>>     }
>>
>> Now for the resulting data. First, the mapping I read using cURL (in this
>> one, I allow _default):
>>
>>     curl -XGET 'http://myserver:9200/realestateagencies/_mapping/record'
>>
>>     { "realestateagencies" :
>>       { "mappings" :
>>         { "record" :
>>           { "properties" :
>>             {
>>               "agencystatus" : { "store" : true, "type" : "string" },
>>               "cardnumber" : { "store" : true, "type" : "string" },
>>               "city_id" : { "store" : true, "type" : "string" },
>>               "companyname" : { "store" : true, "type" : "string" },
>>               [REMOVED SOME MAPPING TO SHORTEN DISPLAY]
>>               "streetlabel" : { "store" : true, "type" : "string" },
>>               "streetnumber" : { "store" : true, "type" : "string" },
>>               "streetnumbercomplement" : { "store" : true, "type" : "string" },
>>               "streettype" : { "store" : true, "type" : "string" },
>>               "summarizedagency_id" : { "store" : true, "type" : "string" },
>>               "updatedate" : { "store" : true, "type" : "string" },
>>               "websiteurl" : { "store" : true, "type" : "string" }
>>             }
>>           }
>>         }
>>       }
>>     }
>>
>> I can't see any "index" setting in my mapping, even though I explicitly
>> gave it a value ( .field("index", "analyzed") ), but I suppose that is
>> because index: analyzed is the default value.
>> After that, a query on my index type gives this result:
>>
>>     {
>>       took: 7
>>       timed_out: false
>>       _shards: {
>>         total: 3
>>         successful: 3
>>         failed: 0
>>       }
>>       hits: {
>>         total: 100000
>>         max_score: 1
>>         hits: [
>>           {
>>             _index: realestateagencies
>>             _type: record
>>             _id: c2yWW2S3TyKkJGGFpgVS4g
>>             _score: 1
>>             _source: {
>>               id: 83163
>>               crawlsource: 1
>>               deletedate: null
>>               updatedate: null
>>               dealerkind: 1
>>               email: null
>>               name: Agence Principale - Colombes
>>               phonenumber: 0142423333
>>               latitude: 0
>>               longitude: 0
>>               normalized: 6RUEGABRIELPERI92700COLOMBES
>>               original: - 6 RUE GABRIEL PERI 92700 COLOMBES
>>               street1: 6 rue Gabriel Peri
>>               street2: null
>>               streetlabel: Gabriel Peri
>>               streetnumber: 6
>>               streetnumbercomplement: 0
>>               streettype: 20
>>               cardnumber: null
>>               companyname: null
>>               createdate: 1969-12-31T23:00:00.000Z
>>               faxnumber: null
>>               logourl: null
>>               normalizedname: AGENCEPRINCIPALECOLOMBES
>>               rcsnumber: null
>>               sirennumber: 450499298
>>               siretnumber: null
>>               websiteurl: null
>>               agencystatus: 1
>>               keyperportal: 484737|AGENCEPRINCIPALECOLOMBES
>>               reconciliationpolicy: 1
>>               city_id: 17988
>>               summarizedagency_id: 408837
>>             }
>>           },
>>           [REST IS REMOVED]
>>
>> And of course, if I disable the _source field, the resulting docs are empty.
>>
>> I cannot see what I'm doing wrong here. Can anyone see something wrong?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/d226606c-fbb4-4167-87ec-92f2f8fe7728%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.
