Hi,
this is an example of an Entity Map definition for setting up full-text
search with an external text index engine (Elasticsearch):
...
<#entMap> a text:EntityMap ;
    text:defaultField "text1" ; # Must be defined in the text:map
    text:map (
        [ text:field "text1" ; text:predicate kog:definition ]
        [ text:field "text1" ; text:predicate kog:value ]
        [ text:field "text2" ; text:predicate kog:segmentValue ]
        [ text:field "text2" ; text:predicate kog:term ]
    ) .
During indexing with *jena.textindexer*, Elasticsearch 6.x does not
accept duplicate keys (whereas 5.x did), so it rejects the document (see
the error below).
Note: it does not complain about "text1", which is defined as the
"defaultField", only about "text2".
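To illustrate what I mean by "duplicate keys", here is a minimal Python
sketch of a lenient versus a strict JSON parser. The two documents are
hypothetical; I have not captured the actual payload that Jena sends, so
this only mimics the kind of check that the trace below reports from
JsonReadContext._checkDup, it is not Elasticsearch's own code.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that fails on a repeated key, like a strict parser."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"Duplicate field '{key}'")
        seen[key] = value
    return seen


def parse_strict(text):
    # Strict mode: a repeated key anywhere in an object raises ValueError.
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


# Hypothetical documents (assumption: NOT the real payload built by Jena).
ok_doc = '{"text1": ["a definition"], "text2": ["a term"]}'
dup_doc = '{"text1": [], "text2": [], "text2": []}'

# Lenient parsing silently keeps the last value for a repeated key.
print(json.loads(dup_doc))

# Strict parsing accepts the clean document and rejects the duplicate one.
print(parse_strict(ok_doc))
try:
    parse_strict(dup_doc)
except ValueError as err:
    print("rejected:", err)
```

If the 5.x parser was lenient in this way, that would explain why the same
Entity Map used to index without complaint.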
To avoid this I tried to work with arrays, but they do not appear to be
supported. Here is an example:
text:map (
    [ text:field "text1" ;
      text:predicate
        [text:field "text1definition" ; kog:definition ],
        [text:field "text1value" ; kog:value];
    ]
    [ text:field "text2" ;
      text:predicate
        [text:field "text2segmentValue" ; kog:segmentValue],
        [text:field "text2term" ; kog:term ];
    ]
------------------------------------------------------------------------
INFO no modules loaded
INFO loaded plugin [org.elasticsearch.index.reindex.ReindexPlugin]
INFO loaded plugin [org.elasticsearch.join.ParentJoinPlugin]
INFO loaded plugin [org.elasticsearch.percolator.PercolatorPlugin]
INFO loaded plugin [org.elasticsearch.script.mustache.MustachePlugin]
INFO loaded plugin [org.elasticsearch.transport.Netty4Plugin]
org.apache.jena.query.text.TextIndexException: Unable to Index the
Entity in ElasticSearch.
at
org.apache.jena.query.text.es.TextIndexES.addEntity(TextIndexES.java:280)
at jena.textindexer.exec(textindexer.java:139)
at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
at jena.textindexer.main(textindexer.java:52)
Caused by: java.util.concurrent.ExecutionException:
RemoteTransportException[[es-test][127.0.0.1:9300][indices:data/write/update]];
nested:
RemoteTransportException[[es-test][127.0.0.1:9300][indices:data/write/update[s]]];
nested: MapperParsingException[failed to parse]; nested:
IOException[Duplicate field 'text2'
at [Source:
org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@1d3cd3e3;
line: 1, column: 24]];
at
org.elasticsearch.common.util.concurrent.BaseFuture$Sync.getValue(BaseFuture.java:265)
at
org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:252)
at
org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:94)
at
org.apache.jena.query.text.es.TextIndexES.addEntity(TextIndexES.java:275)
... 5 more
Caused by:
RemoteTransportException[[es-test][127.0.0.1:9300][indices:data/write/update]];
nested:
RemoteTransportException[[es-test][127.0.0.1:9300][indices:data/write/update[s]]];
nested: MapperParsingException[failed to parse]; nested:
IOException[Duplicate field 'text2'
at [Source:
org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@1d3cd3e3;
line: 1, column: 24]];
Caused by:
RemoteTransportException[[es-test][127.0.0.1:9300][indices:data/write/update[s]]];
nested: MapperParsingException[failed to parse]; nested:
IOException[Duplicate field 'text2'
at [Source:
org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@1d3cd3e3;
line: 1, column: 24]];
Caused by: MapperParsingException[failed to parse]; nested:
IOException[Duplicate field 'text2'
at [Source:
org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@1d3cd3e3;
line: 1, column: 24]];
at
org.elasticsearch.index.mapper.DocumentParser.wrapInMapperParsingException(DocumentParser.java:171)
at
org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:72)
at
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:263)
at
org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:725)
at
org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:702)
at
org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:682)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.lambda$executeIndexRequestOnPrimary$2(TransportShardBulkAction.java:560)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.executeOnPrimaryWhileHandlingMappingUpdates(TransportShardBulkAction.java:579)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.executeIndexRequestOnPrimary(TransportShardBulkAction.java:558)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.executeIndexRequest(TransportShardBulkAction.java:141)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:247)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:124)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:111)
at
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:73)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:1017)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:995)
at
org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:101)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:356)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:296)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:958)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:955)
at
org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:271)
at
org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:238)
at
org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:2249)
at
org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:967)
at
org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:97)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:317)
at
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:292)
at
org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
at
org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:251)
at
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at
org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:309)
at
org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
at
org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:665)
at
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723)
at
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:844)
Caused by: java.io.IOException: Duplicate field 'text2'
at [Source:
org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@1d3cd3e3;
line: 1, column: 24]
at
com.fasterxml.jackson.core.json.JsonReadContext._checkDup(JsonReadContext.java:204)
at
com.fasterxml.jackson.core.json.JsonReadContext.setCurrentName(JsonReadContext.java:198)
at
com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:777)
at
org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:53)
at
org.elasticsearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:405)
at
org.elasticsearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:380)
at
org.elasticsearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:95)
at
org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:69)
... 38 more