[ 
https://issues.apache.org/jira/browse/SPARK-25232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592471#comment-16592471
 ] 

Lijie Xu commented on SPARK-25232:
----------------------------------

Compared to RLIKE, full-text search supports more powerful queries, which is 
useful for tables with large volumes of text. RLIKE is similar to Ctrl+F, while 
full-text search is similar to a search engine that runs queries in NATURAL 
LANGUAGE mode using sophisticated information retrieval and NLP techniques. 
Unlike RLIKE, full-text search offers features such as fuzzy 
matching, query expansion, and result ranking. It ranks the search results 
by their matching scores. In other words, the rows returned are automatically 
sorted with the highest relevance first 
(https://dev.mysql.com/doc/refman/5.7/en/fulltext-search.html). If we 
support full-text search natively, existing full-text queries in RDBMSs can 
be easily ported to Spark SQL queries without building external systems 
such as Elasticsearch.
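
To make the contrast concrete, here is a minimal sketch in plain Python (not Spark code, and not the implementation in our prototype). The sample rows, the function names `rlike` and `ranked_search`, and the simple TF-IDF scoring are all assumptions chosen for illustration; real full-text engines use more elaborate scoring.

```python
import math
import re
from collections import Counter

# Hypothetical sample rows, invented for this illustration.
ROWS = [
    "spark sql supports regular expressions",
    "full text search ranks rows by relevance",
    "search engines rank text by relevance to the search terms",
]

def rlike(rows, pattern):
    """Boolean filter, analogous to RLIKE: matching rows in input order."""
    return [r for r in rows if re.search(pattern, r)]

def ranked_search(rows, query):
    """Score rows with a simple TF-IDF sum; return matches, best first."""
    docs = [Counter(r.split()) for r in rows]
    n = len(docs)
    scored = []
    for row, tf in zip(rows, docs):
        score = 0.0
        for term in query.split():
            # Document frequency: number of rows containing the term.
            df = sum(1 for d in docs if term in d)
            if df:
                score += tf[term] * math.log(1 + n / df)
        if score > 0:
            scored.append((score, row))
    # Highest relevance first, as in a ranked full-text result.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored]
```

With these rows, `rlike(ROWS, "search")` only filters and keeps the input order, while `ranked_search(ROWS, "search relevance")` puts the row that mentions "search" twice ahead of the row that mentions it once.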

Whether to add indexing to Spark SQL is an important question. Indexing 
speeds up both interactive queries and full-text search queries.
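
As a rough illustration of why an index helps keyword lookups, the sketch below builds a small in-memory inverted index in plain Python. It is an assumption-laden toy (whitespace tokenization, AND semantics, hypothetical function names `build_inverted_index` and `lookup`) and does not describe how indexing would actually be added to Spark SQL.

```python
from collections import defaultdict

def build_inverted_index(rows):
    """Map each lowercased term to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, text in enumerate(rows):
        for term in text.lower().split():
            index[term].add(row_id)
    return index

def lookup(index, query):
    """Return ids of rows containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    # index.get avoids creating empty entries for unknown terms.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)
```

Once the index is built, a lookup touches only the posting sets of the query terms instead of scanning every row, which is what makes it attractive for interactive queries.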

Thanks for the information about the SPIP process. We may reorganize this 
proposal to follow the SPIP format if the community supports this issue.

 

 

> Support Full-Text Search in Spark SQL
> -------------------------------------
>
>                 Key: SPARK-25232
>                 URL: https://issues.apache.org/jira/browse/SPARK-25232
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Lijie Xu
>            Priority: Major
>
> Full-text search (i.e., keyword search) is widely used in search engines and 
> relational databases, for example the MATCH() ... AGAINST operator in MySQL 
> (https://dev.mysql.com/doc/en/fulltext-search.html), Text queries in Oracle 
> (https://docs.oracle.com/cd/B28359_01/text.111/b28303/query.htm#g1016054), 
> and text search in PostgreSQL 
> (https://www.postgresql.org/docs/9.5/static/textsearch.html). However, it is 
> not natively supported in Spark SQL. We propose an approach to implement 
> full-text search in Spark SQL.
> Our proposed approach is detailed at 
> [https://github.com/JerryLead/Misc/blob/master/FullTextSearch/Full-text-issue-2018.pdf]
> and the prototype is available at 
> [https://github.com/bigdata-iscas/SparkFullTextQuery/tree/like_explorer]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
