#389: Strip wiki formatting from the Bloodhound Search results
-------------------------+-------------------------------------------------
Reporter: andrej | Owner: andrej
Type: enhancement | Status: assigned
Priority: major | Milestone: Release 5
Component: search | Version:
Resolution: | Keywords: search bep-0004 bhsearch bep-0004-beta
-------------------------+-------------------------------------------------
Comment (by andrej):
The primary source for indexing is the DB. If we need more data from the
wiki markup, we can just reindex the DB and add more fields. As an
alternative, we can store (but not index) the complete wiki fields and
index and search only the stripped version.
I suggest we proceed with index-time stripping and change this if we see
any drawbacks. We can re-index when new features require it. What do you
think?
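A minimal sketch of the index-time approach, assuming a small regex-based
stripper rather than Trac's real wiki parser. The names `strip_wiki_markup`
and `build_search_doc`, and the field names `content` and `content_raw`, are
hypothetical and not part of Bloodhound's API. The raw text is kept next to
the stripped text so it stays available without being searched.

```python
import re

# Hypothetical, simplified patterns for a few common Trac wiki constructs.
# A real implementation would more likely rely on Trac's own wiki parser.
_PATTERNS = [
    (re.compile(r"\{\{\{|\}\}\}"), ""),                      # {{{ code block }}} delimiters
    (re.compile(r"'{2,5}"), ""),                             # ''italic'', '''bold'''
    (re.compile(r"^\s*=+\s*(.*?)\s*=+\s*$", re.M), r"\1"),   # = Heading =
    (re.compile(r"\[\[[^\]]*\]\]"), ""),                     # [[Macro(...)]]
    (re.compile(r"\|\|"), " "),                              # ||table||cells||
]


def strip_wiki_markup(text):
    """Return text with the markup above removed and whitespace collapsed."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return " ".join(text.split())


def build_search_doc(ticket_id, description):
    """Build a document with a stripped field to index and a raw field to store."""
    return {
        "id": ticket_id,
        "content": strip_wiki_markup(description),  # assumed: indexed and searched
        "content_raw": description,                 # assumed: stored only, not indexed
    }
```

Because the raw field is preserved, a later re-index could derive extra
fields from it without going back to a different source.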
Replying to [comment:9 olemis]:
> Replying to [comment:4 jdreimann]:
> > Wouldn't this mean that we lose the information provided by wiki
> > formatting to rank results later? For example, if a word appears styled
> > as a heading via wiki formatting, it probably has a higher score than a
> > word that appears in a cell in a table (again via wiki formatting).
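
One way to keep the ranking signal raised in the quoted question while still
stripping markup is to put heading text into its own field at index time, so
that a search backend can weight it higher than body text. A hedged sketch,
assuming Trac-style `= Heading =` lines; `split_headings`, `FIELD_WEIGHTS`
and `toy_score` are hypothetical names, and the scoring is a toy term count,
not the ranking function of any real search library.

```python
import re

# Assumed heading syntax: one or more '=' on both sides of the title.
_HEADING = re.compile(r"^\s*=+\s*(.*?)\s*=+\s*$")

# Hypothetical weights: a hit in a heading counts twice as much as in the body.
FIELD_WEIGHTS = {"headings": 2.0, "body": 1.0}


def split_headings(wiki_text):
    """Separate heading text from body text so each can be indexed as a field."""
    headings, body = [], []
    for line in wiki_text.splitlines():
        match = _HEADING.match(line)
        if match:
            headings.append(match.group(1))
        else:
            body.append(line)
    return {
        "headings": " ".join(headings),
        "body": " ".join(" ".join(body).split()),
    }


def toy_score(fields, term):
    """Weighted count of exact (case-insensitive) term matches per field."""
    term = term.lower()
    return sum(
        weight * fields[name].lower().split().count(term)
        for name, weight in FIELD_WEIGHTS.items()
    )
```

With this split, a term that appears in a heading contributes more to the
score than the same term in a table cell, and the weights can be changed
later by re-indexing.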
--
Ticket URL: <https://issues.apache.org/bloodhound/ticket/389#comment:10>
Apache Bloodhound <https://issues.apache.org/bloodhound/>
The Apache Bloodhound (incubating) issue tracker