Serkan Mulayim wrote on 6/8/17 7:12 PM:
Hi guys,
I would like to ask whether it is possible to do regex queries (without adding new
fields or tokenizing differently) in the C library. What I need is to
be able to return documents based on file name suffix, so that a
query such as (*.pdf) returns all documents that contain a PDF file.
I understand the complexity a suffix query creates for the searcher,
but in my use case not many files are associated with the documents,
so attachment fields will exist for only a small number of
documents.
If this is not possible, I will instead index the documents with their file types
in a new field (or reverse the attachment names).
AFAIK there is no C implementation of a regex query. I wrote the Perl version:
https://metacpan.org/release/LucyX-Search-WildcardQuery
You will be *much* happier storing the file extension as a separate field
and searching on that. It is far, far more efficient at search time than munging a regex.
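
As a rough illustration of the separate-field approach, here is a minimal
sketch in plain C of deriving the extension value at index time. Note that
file_extension() is a hypothetical helper written for this example, not part
of the Lucy C API, and lowercasing the extension is an assumption about how
you would want to normalize it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Lucy function): copy the lowercased extension
 * of `filename`, without the dot, into `out`.
 * Returns the extension length, or 0 if there is no extension or it does
 * not fit in `out` (in which case `out` is left as an empty string). */
size_t file_extension(const char *filename, char *out, size_t out_size) {
    assert(filename != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    /* Only look at the last path component, so "dir.d/notes" has no
     * extension. */
    const char *base = strrchr(filename, '/');
    base = base ? base + 1 : filename;

    const char *dot = strrchr(base, '.');
    /* No dot, a leading dot only (".bashrc"), or a trailing dot ("name."). */
    if (dot == NULL || dot == base || dot[1] == '\0') {
        return 0;
    }

    size_t len = strlen(dot + 1);
    if (len >= out_size) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        out[i] = (char)tolower((unsigned char)dot[1 + i]);
    }
    out[len] = '\0';
    return len;
}
```

The resulting value (e.g. "pdf" for "report.PDF") would go into its own
untokenized field on each document, so that a search for PDF attachments
becomes an exact term match on that field rather than a suffix scan.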
--
Peter Karman . https://karpet.github.io . https://keybase.io/peterkarman