Re: [Wikidata-tech] lexeme fulltext search display

2018-06-22 Thread Leszek Manicki
On 21 June 2018 at 21:05, Stas Malyshev wrote:

> Hi!
>
> > In any case, thanks for bringing the issue up. Regarding link
> > formatting: this is the very topic we're actively working on right now, in
> > a broader context, and all you need should be there rather soon. Hence
> > I'd prefer to avoid duplicating efforts now.
>
> Could you send me the phab task ID so I could subscribe to it and know
> when it's ready? I've paused the fulltext patch for now waiting for it,
>

Hi Stas,

it is going to take some time until this is all ready, so in order to
unblock your work I went ahead and had a look at the link formatter and
HtmlPageLinkRendererBeginHookHandler, as it seemed pretty straightforward
to get a basic (hopefully working) version done on a Friday afternoon.
You can see what I came up with at
https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseLexeme/+/441554/

I hope that helps with the search work. Feel free to mention any missing
bits or issues. The formatter is very rudimentary for now (e.g. the "title"
attribute intentionally shows just the form ID), but I hope this is an
implementation detail that can be refined later (we'll take care of the
formatter).

Have a good weekend!




> --
> Stas Malyshev
> smalys...@wikimedia.org
>



-- 
Leszek Manicki
Software Developer

Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
http://wikimedia.de

Imagine a world, in which every single human being can freely share in the
sum of all knowledge. That's our commitment.

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


Re: [Wikidata-tech] lexeme fulltext search display

2018-06-18 Thread Leszek Manicki
On 19 June 2018 at 01:35, Stas Malyshev wrote:

> Hi!
>
> > I can reimplement it manually, but I would be largely duplicating what
> > HtmlPageLinkRendererBeginHookHandler is supposed to do. The problem
> > seems to be that it is not doing it right. When the code works on a
> > link like /wiki/Lexeme:L2#L2-F1, it does this:
> >
> > $entityId = $foreignEntityId ?:
> >     $this->entityIdLookup->getEntityIdForTitle( $target );
> >
> > This returns the Lexeme ID instead of the Form ID. It can't return the
> > Form ID since a Form does not have its own content model, and
> > getEntityIdForTitle uses the content model to get from Title to ID. So, I
> > could duplicate all this code but I don't particularly like that. Could we
> > fix HtmlPageLinkRendererBeginHookHandler instead, maybe?
>
> Also, it looks like Form doesn't actually have a link-formatter-callback or
> its own link formatter code. So I wonder if there's an existing facility
> to format links to Forms? Leszek, do you have any information on this?
>

Hi Stas,
would \Wikibase\Lexeme\PropertyType\FormIdHtmlFormatter be of any help to
you in this case, at least temporarily?

In any case, thanks for bringing the issue up. Regarding link formatting:
this is the very topic we're actively working on right now, in a broader
context, and all you need should be there rather soon. Hence I'd prefer to
avoid duplicating efforts now.
If this works for you, Stas, my suggestion would be that you assume that the
link formatting code is there, and make sure the data needed to display the
link is in the search results etc. Once the groundwork we're doing now is
finished, and Lydia has defined how the search result link etc. should look,
we'll add it there.

Best



>
> Thanks,
> --
> Stas Malyshev
> smalys...@wikimedia.org
>





Re: [Wikidata-tech] lexeme fulltext search display

2018-06-18 Thread Stas Malyshev
Hi!

> I can reimplement it manually, but I would be largely duplicating what
> HtmlPageLinkRendererBeginHookHandler is supposed to do. The problem
> seems to be that it is not doing it right. When the code works on a
> link like /wiki/Lexeme:L2#L2-F1, it does this:
>
> $entityId = $foreignEntityId ?:
>     $this->entityIdLookup->getEntityIdForTitle( $target );
>
> This returns the Lexeme ID instead of the Form ID. It can't return the
> Form ID since a Form does not have its own content model, and
> getEntityIdForTitle uses the content model to get from Title to ID. So, I
> could duplicate all this code but I don't particularly like that. Could we
> fix HtmlPageLinkRendererBeginHookHandler instead, maybe?

Also, it looks like Form doesn't actually have a link-formatter-callback or
its own link formatter code. So I wonder if there's an existing facility
to format links to Forms? Leszek, do you have any information on this?

Thanks,
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata-tech] lexeme fulltext search display

2018-06-18 Thread Stas Malyshev
Hi!

> You can use an EntityTitleLookup to get the Title object for an EntityId. In
> case of a Form, it will point to the appropriate section. You can use the

OK, I see it's just adding form id as a fragment, so it's easy I guess.

> LinkRenderer service to make a link. Or you can use an EntityIdHtmlLinkFormatter,
> which should do the right thing. You can get one from an
> OutputFormatValueFormatterFactory.

I can reimplement it manually, but I would be largely duplicating what
HtmlPageLinkRendererBeginHookHandler is supposed to do. The problem
seems to be that it is not doing it right. When the code works on a
link like /wiki/Lexeme:L2#L2-F1, it does this:

$entityId = $foreignEntityId ?:
    $this->entityIdLookup->getEntityIdForTitle( $target );

This returns the Lexeme ID instead of the Form ID. It can't return the
Form ID since a Form does not have its own content model, and
getEntityIdForTitle uses the content model to get from Title to ID. So, I
could duplicate all this code but I don't particularly like that. Could we
fix HtmlPageLinkRendererBeginHookHandler instead, maybe?
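
To make the problem concrete, here is a minimal sketch of resolving a link
target to a Form ID by also looking at the fragment, instead of relying on the
page title alone. It is illustrative Python, not the Wikibase PHP code;
`entity_id_for_link` and the two ID patterns are hypothetical and assume IDs of
the shape L2 and L2-F1 as used in this thread.

```python
import re
from typing import Optional

# Assumed ID shapes, taken from the examples in this thread (L2, L2-F1).
LEXEME_ID = re.compile(r"^L[1-9]\d*$")
FORM_ID = re.compile(r"^L[1-9]\d*-F[1-9]\d*$")


def entity_id_for_link(href: str) -> Optional[str]:
    """Return the Form ID if the fragment names a Form of the linked Lexeme,
    the Lexeme ID if it does not, and None for non-Lexeme links."""
    path, _, fragment = href.partition("#")
    page = path.rsplit("/", 1)[-1]        # e.g. "Lexeme:L2"
    lexeme_id = page.split(":", 1)[-1]    # e.g. "L2"
    if not LEXEME_ID.match(lexeme_id):
        return None
    # Only trust the fragment if it is a Form ID belonging to this Lexeme.
    if FORM_ID.match(fragment) and fragment.startswith(lexeme_id + "-"):
        return fragment
    return lexeme_id


print(entity_id_for_link("/wiki/Lexeme:L2#L2-F1"))  # L2-F1
print(entity_id_for_link("/wiki/Lexeme:L2"))        # L2
```

A title-only lookup corresponds to always taking the last branch, which is why
the Lexeme ID comes back for a Form link.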
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata-tech] lexeme fulltext search display

2018-06-18 Thread Daniel Kinzler
Am 18.06.2018 um 19:25 schrieb Stas Malyshev:
> 1. What will the link point to? I haven't found the code to
> generate a link to a specific Form.

You can use an EntityTitleLookup to get the Title object for an EntityId. In
the case of a Form, it will point to the appropriate section. You can use the
LinkRenderer service to make a link. Or you can use an EntityIdHtmlLinkFormatter,
which should do the right thing. You can get one from an
OutputFormatValueFormatterFactory.
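
As a rough illustration of "point to the appropriate section": a Form's link
target is the Lexeme page plus the Form ID as fragment. The sketch below is
illustrative Python, not the EntityTitleLookup implementation;
`link_target_for_form` is a hypothetical helper, and the "Lexeme" namespace
name is assumed from the /wiki/Lexeme:L2#L2-F1 example elsewhere in this
thread.

```python
def link_target_for_form(form_id: str, namespace: str = "Lexeme") -> str:
    """Build a page-plus-fragment target for a Form ID such as "L2-F1"."""
    lexeme_id, sep, form_part = form_id.partition("-")
    if not (sep and lexeme_id and form_part):
        raise ValueError(f"not a Form ID: {form_id!r}")
    # The page is the Lexeme; the fragment is the full Form ID.
    return f"{namespace}:{lexeme_id}#{form_id}"


print(link_target_for_form("L2-F1"))  # Lexeme:L2#L2-F1
```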

-- daniel



Re: [Wikidata-tech] lexeme fulltext search display

2018-06-18 Thread Stas Malyshev
Hi!

>> color/colour (L123)
>> colors: plural for color (L123): English noun
> 
> I'd rather have this:
> 
>  colors/colours (L123-F2)
>  plural of color (L123): English noun

This part is a bit trickier since the title is still L123, so the system
currently generates the link for L123. I could override that, but I see
two questions:
1. What will the link point to? I haven't found the code to
generate a link to a specific Form. I could write new code, but if it
sits outside the main classes it may be a fragile design.
2. This means overriding the standard linking code and possibly
reimplementing part of it (depending on whether this code supports
generating a Form link instead of a Lexeme link), which may again be a bit
fragile, unless I find a standard means to do it.

> Note that in place of "plural", you may have something like "3rd person,
> singular, past, conjunctive", derived from multiple Q-ids.

Yes, of course.

> Again, I don't think any highlighting is needed.

Not strictly speaking needed, but might be nice.

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata-tech] lexeme fulltext search display

2018-06-18 Thread Daniel Kinzler
Hi Stas!

Your proposal is pretty much what I envision.

Am 14.06.2018 um 19:39 schrieb Stas Malyshev:
> I plan to display Lemma match like this:
> 
> title (LN)
> Synthetic description
> 
> e.g.
> 
> color/colour (L123)
> English noun
> 
> Meaning, the first line with link would be standard lexeme link
> generated by Lexeme code (which also deals with multiple lemmas) and the
> description line is generated description of the Lexeme - just like in
> completion search.

Sounds perfect to me.

> The problem here, however, is since the link is
> generated by the Lexeme code, which has no idea about search, we can not
> properly highlight it. This can be solved with some trickery, probably,
> e.g. to locate search matches inside generated string and highlight
> them, but first I'd like to ensure this is the way it should be looking.

Do we really need the highlight? It does not seem critical to me for this use
case. Just "nice to have".

> More tricky is displaying the Form (representation) match. I could
> display here the same as above, but I feel this might be confusing.
> Another option is to display Form data, e.g. for "colors":
> 
> color/colour (L123)
> colors: plural for color (L123): English noun

I'd rather have this:

 colors/colours (L123-F2)
 plural of color (L123): English noun

Note that in place of "plural", you may have something like "3rd person,
singular, past, conjunctive", derived from multiple Q-ids.
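
The layout suggested above could be assembled roughly as follows. This is an
illustrative Python sketch, not code from the search patch; `FormMatch` and
`render_form_result` are hypothetical names, and the grammatical feature
labels are assumed to have been resolved from their Q-ids already.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FormMatch:
    form_id: str                 # e.g. "L123-F2"
    representations: List[str]   # e.g. ["colors", "colours"]
    features: List[str]          # labels of grammatical features
    lemma: str                   # e.g. "color"
    lexeme_id: str               # e.g. "L123"
    language: str                # e.g. "English"
    category: str                # e.g. "noun"


def render_form_result(match: FormMatch) -> Tuple[str, str]:
    """Return (title line, description line) for a Form match."""
    title = f"{'/'.join(match.representations)} ({match.form_id})"
    description = (
        f"{', '.join(match.features)} of {match.lemma} "
        f"({match.lexeme_id}): {match.language} {match.category}"
    )
    return title, description


title, description = render_form_result(
    FormMatch("L123-F2", ["colors", "colours"], ["plural"],
              "color", "L123", "English", "noun")
)
print(title)        # colors/colours (L123-F2)
print(description)  # plural of color (L123): English noun
```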

> The description line features matched Form's representation and
> synthetic description for this form. Right now the matched part is not
> highlighted - because it will otherwise always be highlighted, as it is
> taken from the match itself, so I am not sure whether it should be or not.

Again, I don't think any highlighting is needed.

But, as you know, it's all up to Lydia to decide :)

-- daniel

-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



[Wikidata-tech] lexeme fulltext search display

2018-06-14 Thread Stas Malyshev
Hi!

I am now working on Lexeme fulltext search. One of the unclear points I
have encountered is how to display Lexemes as search results. I am
working on the assumption that we want to match both Lemmas and Forms (please
tell me if I'm wrong). Given a match, I plan to display a Lemma match
like this:

title (LN)
Synthetic description

e.g.

color/colour (L123)
English noun

Meaning, the first line with the link would be the standard lexeme link
generated by the Lexeme code (which also deals with multiple lemmas), and the
description line is the generated description of the Lexeme, just like in
completion search. The problem here, however, is that since the link is
generated by the Lexeme code, which has no idea about search, we cannot
properly highlight it. This can probably be solved with some trickery,
e.g. by locating search matches inside the generated string and highlighting
them, but first I'd like to make sure this is the way it should look.
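
The "locate matches inside the generated string" idea could look roughly like
this. It is an illustrative Python sketch, not the search patch;
`highlight_matches` is a hypothetical helper, it matches query words as plain
case-insensitive substrings (an assumption, real search matching is more
involved), and the `searchmatch` class name is assumed.

```python
import html
import re


def highlight_matches(label: str, query: str) -> str:
    """Wrap every occurrence of a query word in the plain-text label in a
    highlight span, HTML-escaping everything else."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return html.escape(label)
    pattern = re.compile("|".join(terms), re.IGNORECASE)
    parts = []
    pos = 0
    for match in pattern.finditer(label):
        # Escape the text between matches, then wrap the match itself.
        parts.append(html.escape(label[pos:match.start()]))
        parts.append('<span class="searchmatch">'
                     + html.escape(match.group(0)) + "</span>")
        pos = match.end()
    parts.append(html.escape(label[pos:]))
    return "".join(parts)


print(highlight_matches("color/colour (L123)", "colour"))
# color/<span class="searchmatch">colour</span> (L123)
```

Matching is done on the plain label before any markup is added, so the
highlight spans cannot land inside an existing tag.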

Displaying the Form (representation) match is trickier. I could
display the same as above here, but I feel this might be confusing.
Another option is to display Form data, e.g. for "colors":

color/colour (L123)
colors: plural for color (L123): English noun

The description line features the matched Form's representation and a
synthetic description for this form. Right now the matched part is not
highlighted: since it is taken from the match itself, it would always be
highlighted, so I am not sure whether it should be or not.

So, does this display look like what we want to produce for Lexemes? Is
there something that needs to be changed or improved? I would like to hear
some feedback.

Thanks,
-- 
Stas Malyshev
smalys...@wikimedia.org
