It looks like that indexing code might not be correct. I just tried this 
code and it works for me:

      try {
        String fileContents = readContent( new File( "fn6742.pdf" ) );
     
        try {
          DeleteIndexResponse deleteIndexResponse =
              new DeleteIndexRequestBuilder( client.admin().indices(), INDEX_NAME )
                  .execute().actionGet();
          if (deleteIndexResponse.isAcknowledged() ) {
            System.out.println( "Deleted index" );
          } 
        }
        catch (Exception e) {
          // ignore: the index may not exist yet
        }
         
        CreateIndexResponse createIndexResponse =
            new CreateIndexRequestBuilder( client.admin().indices(), INDEX_NAME )
                .execute().actionGet();
         
        if ( createIndexResponse.isAcknowledged() ) {
          System.out.println( "Created index" );
        }
         
        PutMappingResponse putMappingResponse = new PutMappingRequestBuilder(
            client.admin().indices() ).setIndices( INDEX_NAME ).setType( DOCUMENT_TYPE ).setSource(
            XContentFactory.jsonBuilder().startObject()
              .field("doc").startObject()
                .field( "properties" ).startObject()
                  .field( "file" ).startObject()
                    .field( "term_vector", "with_positions_offsets" )
                    .field( "store", "yes" )
                    .field( "type", "attachment" )
                    .field("fields").startObject()
                      .field("file").startObject()
                        .field("store", "yes")
                      .endObject()
                    .endObject()
                  .endObject()
                .endObject()
              .endObject()
            .endObject()
        ).execute().actionGet();
         
        if ( putMappingResponse.isAcknowledged() ) {
          System.out.println( "Successfully defined mapping" );
        }
         
        IndexResponse indexResponse = client.prepareIndex( INDEX_NAME, DOCUMENT_TYPE, "1" )
          .setSource(XContentFactory.jsonBuilder()
          .startObject()
            .field( "file").startObject()
              .field("content", fileContents)
              .field("_indexed_chars", -1)
            .endObject()
          .endObject()
        ).execute().actionGet();
         
        System.out.println( "Document indexed success: " + indexResponse.isCreated() );
      } catch ( Exception e ) {
        System.out.println(e.toString());
      }
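
The readContent helper is not shown in the code above. The attachment type
expects the "content" field to carry the file's bytes Base64-encoded, so a
minimal sketch of such a helper might look like the following. The class name
AttachmentContent is made up for illustration, and the use of Java 8's
java.util.Base64 is an assumption; the helper actually used may differ.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Base64;

// Hypothetical stand-in for the readContent helper called in the code above.
final class AttachmentContent {
    private AttachmentContent() {}

    // Reads the whole file into memory and returns it Base64-encoded,
    // the form the attachment type expects in its "content" field.
    static String readContent(File file) throws IOException {
        byte[] bytes = Files.readAllBytes(file.toPath());
        return Base64.getEncoder().encodeToString(bytes);
    }
}
```

This loads the entire file into memory, which should be fine for a small PDF
but may need rethinking for very large attachments.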


And then when I query:

{
  "fields": "*",
  "query": {
    "match_all": {}
  }
}

I get back this:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "msdocs",
      "_type" : "doc",
      "_id" : "1",
      "_score" : 1.0,
      "fields" : {
        "file" : [ "\n1\nISL99201\nCAUTION: These devices are sensitive to electrostatic discharge; follow proper IC Handling Procedures.\n1-888-INTERSIL or 1-888-468-3774" ]
      }
    } ]
  }
}
