[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Kawamura updated NIFI-3248:
--------------------------------
Description:
GetSolr keeps the last query timestamp so that it only fetches documents that
have been added or updated since the last query.
However, GetSolr misses some of those updated documents, and once a document's
date field value becomes older than the last query timestamp, GetSolr can no
longer query that document.
This JIRA tracks the investigation of this behavior and the discussion
around it.
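The effect described above can be sketched in a few lines. This is a minimal illustration, not GetSolr's code: the class `TimestampWindowSketch`, the method `fetchWindow`, the document ids, and the timestamps are all hypothetical, and the sketch simply assumes that a document can show up in the index with a date value at or before the stored timestamp, without saying how that happens.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TimestampWindowSketch {

    // Returns the ids whose date value lies in (lastEnd, now], mimicking an
    // incremental fetch that only looks forward from the stored timestamp.
    static List<String> fetchWindow(Map<String, Instant> docs, Instant lastEnd, Instant now) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Instant> e : docs.entrySet()) {
            Instant ts = e.getValue();
            if (ts.isAfter(lastEnd) && !ts.isAfter(now)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical timestamps, one minute apart.
        Instant t0 = Instant.parse("2016-12-26T00:00:00Z");
        Instant t1 = Instant.parse("2016-12-26T00:01:00Z");
        Instant t2 = Instant.parse("2016-12-26T00:02:00Z");

        Map<String, Instant> index = new LinkedHashMap<>();
        index.put("doc-a", t0.plusSeconds(10));

        // First run covers (t0, t1] and returns doc-a.
        System.out.println(fetchWindow(index, t0, t1)); // prints [doc-a]

        // Assumption: doc-b appears after the first run, but its date value
        // is older than t1, the timestamp stored by that run.
        index.put("doc-b", t0.plusSeconds(50));

        // Second run covers (t1, t2]; doc-b falls outside it and outside
        // every later window as well.
        System.out.println(fetchWindow(index, t1, t2)); // prints []
    }
}
```

Because each run only moves the lower bound forward, nothing in this scheme ever looks back at doc-b.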
Possible causes of this behavior:
|#|Short description|Should we address it?|
|1|Timestamp range filter, curly or square bracket?|No|
-However, since the timestamp filter is not properly formatted as a valid time
range filter, GetSolr never fetches newly added documents.-
Although the timestamp range filter is not properly formatted with square
brackets, GetSolr still seems to fetch newly added documents, so I lowered
the priority.
The code is the same in the [0.5 branch|https://github.com/apache/nifi/blob/support/nifi-0.5.x/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java#L202],
so it seems it has not been working as expected since then.
{code}
// if initialized then apply a filter to restrict results from the last end time til now
if (initialized) {
    StringBuilder filterQuery = new StringBuilder();
    filterQuery.append(context.getProperty(DATE_FIELD).getValue())
            // This should be a square bracket :[
            .append(":{").append(lastEndDatedRef.get()).append(" TO ")
            .append(currDate).append("]");
    solrQuery.addFilterQuery(filterQuery.toString());
    logger.info("Applying filter query {}", new Object[]{filterQuery.toString()});
}
{code}
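To make the bracket question concrete, here is a small standalone sketch that builds the same filter string with either opening bracket. The class `RangeFilterSketch`, the method `buildFilter`, the field name `created`, and the timestamps are hypothetical and not taken from GetSolr. As far as I understand Solr's range syntax, a square bracket makes a bound inclusive and a curly bracket makes it exclusive; the sketch only shows the strings and does not query Solr.

```java
class RangeFilterSketch {

    // Builds "<field>:<open><from> TO <to>]". The upper bound is always
    // inclusive here; only the opening bracket varies.
    static String buildFilter(String dateField, String from, String to, boolean inclusiveLower) {
        String open = inclusiveLower ? "[" : "{";
        return new StringBuilder()
                .append(dateField)
                .append(":").append(open).append(from)
                .append(" TO ")
                .append(to).append("]")
                .toString();
    }

    public static void main(String[] args) {
        // Hypothetical field name and timestamps, for illustration only.
        String field = "created";
        String lastEnd = "2016-12-26T00:00:00Z";
        String now = "2016-12-26T00:01:00Z";

        // Curly opening bracket, as in the snippet above.
        System.out.println(buildFilter(field, lastEnd, now, false));
        // prints created:{2016-12-26T00:00:00Z TO 2016-12-26T00:01:00Z]

        // Square opening bracket, as the code comment suggests.
        System.out.println(buildFilter(field, lastEnd, now, true));
        // prints created:[2016-12-26T00:00:00Z TO 2016-12-26T00:01:00Z]
    }
}
```

Under that reading, the two forms differ only for a document whose date value equals the previous end time exactly: the curly form skips it in the next run, while the square form would return it in two consecutive runs, since the previous run's upper bound was already inclusive.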
was:
GetSolr keeps the last query timestamp so that it only fetches documents that
have been added or updated since the last query.
-However, since the timestamp filter is not properly formatted as a valid time
range filter, GetSolr never fetches newly added documents.-
Although the timestamp range filter is not properly formatted with square
brackets, GetSolr still seems to fetch newly added documents, so I lowered
the priority.
The code is the same in the [0.5 branch|https://github.com/apache/nifi/blob/support/nifi-0.5.x/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java#L202],
so it seems it has not been working as expected since then.
{code}
// if initialized then apply a filter to restrict results from the last end time til now
if (initialized) {
    StringBuilder filterQuery = new StringBuilder();
    filterQuery.append(context.getProperty(DATE_FIELD).getValue())
            // This should be a square bracket :[
            .append(":{").append(lastEndDatedRef.get()).append(" TO ")
            .append(currDate).append("]");
    solrQuery.addFilterQuery(filterQuery.toString());
    logger.info("Applying filter query {}", new Object[]{filterQuery.toString()});
}
{code}
> GetSolr cannot query newly added documents
> ------------------------------------------
>
> Key: NIFI-3248
> URL: https://issues.apache.org/jira/browse/NIFI-3248
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.0.0, 0.5.0, 0.6.0, 0.5.1, 0.7.0, 0.6.1, 1.1.0, 0.7.1,
> 1.0.1
> Reporter: Koji Kawamura
> Priority: Minor
> Attachments: nifi-flow.png, query-result-with-curly-bracket.png,
> query-result-with-square-bracket.png