Re: [Wikidata-tech] Wikidata full text search

2018-05-31 Thread Stas Malyshev
Hi!

> Would there be any drawback to the following steps as a way forward
> and a chance to learn more as we go?
> 1. We return results for the Lexeme namespace only when people
> explicitly select it

If you mean "it and only it" (as opposed to Lexemes + any other
namespace), then yes, this is doable and this is probably what I am
going to start with. However, a lot of people - as I observed with
several community members - tend to use the "All" option and expect it
to work.
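
As a rough illustration of the "it and only it" rule, here is a minimal
Python sketch. The function name and the namespace labels are
hypothetical and chosen for illustration; this is not actual search
code:

```python
def namespaces_to_search(selected):
    """Return the namespaces to query under the "Lexeme only" rule.

    Lexemes are searched only when Lexeme is the sole selected namespace;
    in any other selection the Lexeme namespace is dropped.
    """
    selected = set(selected)
    if selected == {"Lexeme"}:
        return {"Lexeme"}
    return selected - {"Lexeme"}


# Lexeme alone is honored; Lexeme mixed with anything else is ignored.
print(sorted(namespaces_to_search(["Lexeme"])))
print(sorted(namespaces_to_search(["Item", "Property", "Lexeme"])))
```

Under this rule a user who picks "All" gets no Lexeme results, which is
the expectation gap described above.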

> 2. We get feedback
> 3. We go the "Best possible query" route when people select all namespaces
> 4. We get feedback
> 5. We go the "Best possible query" route for all searches if feedback
> indicates this is useful (I don't know at this point)

I am not sure which mode is best for Wikidata right now. There are at
least several plausible defaults for Special:Search:
1. Search in Items only
2. Search in Items + Properties
3. Search in Items + Properties + Lexemes
4. Search in Items + Lexemes
5. Any of the above plus some of the article spaces (e.g. Wikidata or Help)

All of these except 1 and 2 require mixed search to work, but which
default to choose is a separate decision from that.
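
To make the "except for 1 and 2" part concrete, here is a small Python
sketch. The grouping of namespaces into query families and all names in
it are assumptions made for illustration, not actual Wikibase or
CirrusSearch configuration. Items and Properties share one family
because search can treat them alike:

```python
# Hypothetical grouping of namespaces by the kind of query they need.
QUERY_FAMILY = {
    "Item": "entity",
    "Property": "entity",
    "Lexeme": "lexeme",
    "Wikidata": "article",
    "Help": "article",
}

# The candidate defaults from the list above; mode 5 is shown with one
# example combination, since it covers many.
DEFAULT_MODES = {
    1: ["Item"],
    2: ["Item", "Property"],
    3: ["Item", "Property", "Lexeme"],
    4: ["Item", "Lexeme"],
    5: ["Item", "Help"],
}


def needs_mixed_search(namespaces):
    """True if the namespaces span more than one query family."""
    families = {QUERY_FAMILY[ns] for ns in namespaces}
    return len(families) > 1


for mode, namespaces in DEFAULT_MODES.items():
    print(mode, namespaces, needs_mixed_search(namespaces))
```

Only modes 1 and 2 stay within a single family, so only they can be
served by one specialized query.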
-- 
Stas Malyshev
smalys...@wikimedia.org

___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


Re: [Wikidata-tech] Wikidata full text search

2018-05-31 Thread Lydia Pintscher
Hey Stas,

Thanks for digging into this and writing it down!
Would there be any drawback to the following steps as a way forward
and a chance to learn more as we go?
1. We return results for the Lexeme namespace only when people
explicitly select it
2. We get feedback
3. We go the "Best possible query" route when people select all namespaces
4. We get feedback
5. We go the "Best possible query" route for all searches if feedback
indicates this is useful (I don't know at this point)


Cheers
Lydia
On Thu, May 31, 2018 at 2:26 AM Stas Malyshev wrote:
>
> Hi!
>
> While working on fulltext search for Lexemes, I have encountered a
> question which I think needs to be discussed and resolved. The question
> is how fulltext search should work when dealing with different content
> models, and what search should do by default and in specialized cases.
>
> The main challenge in Wikidata is that we are dealing with substantially
> different content models: articles, Items (including Properties,
> because while they are formally a different type, they are similar
> enough to Items for search to ignore the difference) and Lexemes each
> organize their data in a different way and should be searched using
> different specialized queries. This is currently unique to Wikidata, but
> SDC might eventually have to deal with the same challenge. I've
> described the challenges and open questions in more detail here:
>
> https://www.wikidata.org/wiki/User:Smalyshev_(WMF)/Wikidata_search#Fulltext_search
>
> I'd like to first hear some feedback about what the expectations for
> combined search are: what is expected to work, how it is expected to
> work, what the defaults are, and what the use cases for these are. I
> have outlined some solutions that were proposed on wiki; if you have
> any comments, please feel welcome to respond either here or on wiki.
>
> The TLDR version is that searching across different data models is
> hard, and we will need to sacrifice something to make it work. We need
> to figure out and decide which of these sacrifices are acceptable and
> what is enabled or disabled by default.
>
> Thanks,
> --
> Stas Malyshev
> smalys...@wikimedia.org



-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata


Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.



[Wikidata-tech] Wikidata full text search

2018-05-30 Thread Stas Malyshev
Hi!

While working on fulltext search for Lexemes, I have encountered a
question which I think needs to be discussed and resolved. The question
is how fulltext search should work when dealing with different content
models, and what search should do by default and in specialized cases.

The main challenge in Wikidata is that we are dealing with substantially
different content models: articles, Items (including Properties,
because while they are formally a different type, they are similar
enough to Items for search to ignore the difference) and Lexemes each
organize their data in a different way and should be searched using
different specialized queries. This is currently unique to Wikidata, but
SDC might eventually have to deal with the same challenge. I've
described the challenges and open questions in more detail here:

https://www.wikidata.org/wiki/User:Smalyshev_(WMF)/Wikidata_search#Fulltext_search

I'd like to first hear some feedback about what the expectations for
combined search are: what is expected to work, how it is expected to
work, what the defaults are, and what the use cases for these are. I
have outlined some solutions that were proposed on wiki; if you have
any comments, please feel welcome to respond either here or on wiki.

The TLDR version is that searching across different data models is
hard, and we will need to sacrifice something to make it work. We need
to figure out and decide which of these sacrifices are acceptable and
what is enabled or disabled by default.

Thanks,
-- 
Stas Malyshev
smalys...@wikimedia.org
