[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-585427783 @mcvsubbu , I have addressed the review comments: Latest changes: - Move the realtime state handler to a separate class RealtimeLuceneIndexRefreshState. This manages the global queue. - The background task is started only if we have a TEXT index column. - Other review comments - Rebased on my [previous FieldConfig change](https://github.com/apache/incubator-pinot/pull/5006). Text index creation info for a column is now specified as part of FieldConfig. - Moved index files under v3/ **Major TODO:** Handle segment reload. Once this PR goes in, I will come up with a follow-up PR immediately to address that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-585427783 @mcvsubbu , I have addressed the review comments: Latest changes: - Move the realtime state handler to a separate class RealtimeLuceneIndexRefreshState. This manages the global queue. - The background task is started only if we have a TEXT index column. - Other review comments - Rebased on my previous FieldConfig change. Text index creation info for a column is now specified as part of FieldConfig. - Moved index files under v3/ **Major TODO:** Handle segment reload. Once this PR goes in, I will come up with a follow-up PR immediately to address that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-585427783 @mcvsubbu , I have addressed the review comments: Latest changes: - Move the realtime state handler to a separate class RealtimeLuceneIndexRefreshState. This manages the global queue. - The background task is started only if we have a TEXT index column. - Other review comments - Rebased on my previous FieldConfig change. Text index creation info for a column is now specified as part of FieldConfig. **Major TODO:** Handle segment reload. Once this PR goes in, I will come up with a follow-up PR immediately to address that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-585427783 @mcvsubbu , I have addressed the review comments: Latest changes: - Move the realtime state handler to a separate class RealtimeLuceneIndexRefreshState. This manages the global queue. - The background task is started only if we have a TEXT index column. - Other review comments **Major TODO:** Handle segment reload. Once this PR goes in, I will come up with a follow-up PR immediately to address that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-576459692 > Related comment in PR 4954. I don't have an answer, but we do need to resolve it > > Also, are we supporting all of Lucene's expressions in text search? How do we document what we support here? > > It may help if the PR is broken into smaller PRs, for easier review. > > It will also help if we have unit tests for some newly added classes. I understand there is an integration test, but it is always useful to have more class level tests. > > There seem to be subtle multi-thread considerations, will be great if we can exercise some of that code path. TestTextSearchQueries have exhaustive tests that exercise all the code paths (including realtime multi-threaded, reference counting). They also exercise all the code paths for newly added modules. The cluster integration test also does that. The tests have been run in debug mode as well to verify the execution path and ensuring they are touching all paths. I will come up with a follow-up PR that will add new docs explaining the future and its usage. This PR need not be broken into smaller PRs since the functionality won't work otherwise -- the actual code change is only just under 3k. There is a 24k line text file for testing. In fact, some follow up work is still needed that will add the new data type. It is not included in this PR as noted in the description. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-576460048 > Also, is this supported in the SQL path? Yes it will be supported. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-576464652 TODOs: (1) Add the new data type. (2) Extend Calcite SQL grammar and parser to support compilation of TEXT_MATCH queries via Calcite. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime)
siddharthteotia edited a comment on issue #4993: Support Text column type in Pinot (both offline and realtime) URL: https://github.com/apache/incubator-pinot/pull/4993#issuecomment-576460048 > Also, is this supported in the SQL path? Yes it will be supported. As noted in my comments https://github.com/apache/incubator-pinot/pull/4993/files#r368180500, Calcite SQL grammar will have to extended to recognize the token TEXT_MATCH. It will be done in the follow-up PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org