Hello Alex,

although I'm only a lowly member of this mailing group, I feel honored that you've singled me out based on my last name and asked me for help with unsubscribing, which I will promptly provide:
At the bottom of this email message you will find a link, which I repeat here: http://developer.marklogic.com/mailman/listinfo/general

Clicking on this link will lead you to a web page whose bottom half has a section dedicated to unsubscribing from this list. It should be sufficient to enter your email address and click "unsubscribe". Actually, I've just done it for you. You should have received a message with a link that you need to click in order to confirm your request to unsubscribe.

It has been a pleasure helping ...

"Jakob'll Fix it" (tm)

On Fri, Mar 27, 2015 at 5:12 PM, <[email protected]> wrote:
> Hello Jacob Fix,
>
> Can you please remove me from the list? I ask you personally, because I have
> asked generally for over two years, but I am still on it. Your last name is
> "Fix," so maybe you can actually "Fix" it :-)
>
> Thanks,
> --Alex
>
> -------- Original Message --------
> Subject: Re: [MarkLogic Dev General] question about
> xdmp:encoding-language-detect
> From: Jakob Fix <[email protected]>
> Date: Fri, March 27, 2015 12:09 pm
> To: MarkLogic Developer Discussion <[email protected]>
>
> Thanks for your respective answers. My concern is that I've tried two
> other detection services, the obvious one which is Google's
> translation service which detected the language automatically, and
> another one called detectlanguage.com which provides an API which also
> detected correctly the language in the exact same text sample that I
> used with MarkLogic's language detection feature.
> cheers,
> Jakob.
>
>
> On Fri, Mar 27, 2015 at 5:01 PM, Justin Makeig
> <[email protected]> wrote:
>> Jakob,
>> Are there any other markers that are specific to your domain that could
>> help you triangulate? The built-in detection doesn't (and can't) know the
>> context of your business. Some pre- or post-detection analysis might help
>> you to better narrow.
>> For example, is a specific source known to not have
>> Croatian or Serbian content, but might have Latvian? Are there entities
>> (e.g. names, addresses, etc.) that are decent indicators of Latvian? I don't
>> know the specifics of your app or content, but there might be other context
>> that you could pull in to enhance the out-of-the-box identification.
>>
>> Justin
>>
>>
>> --
>> Justin Makeig
>> Director, Product Management
>> MarkLogic
>> [email protected]
>> +1 (650) 655-2387
>>
>>> On Mar 27, 2015, at 8:44 AM, Jakob Fix <[email protected]> wrote:
>>>
>>> Thanks Mary for your quick reply. It's an explanation that I
>>> understand, but this doesn't resolve my initial problem.
>>> Any idea how to solve this in the short term and whether there are
>>> improvements in the pipeline? Or that it's not a high priority?
>>>
>>> cheers,
>>> Jakob.
>>>
>>>
>>> On Fri, Mar 27, 2015 at 4:34 PM, Mary Holstege
>>> <[email protected]> wrote:
>>>> On Fri, 27 Mar 2015 08:23:19 -0700, Jakob Fix <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello, I think this message got lost when the mailing list was down in
>>>>> February (or nobody has an answer ...)
>>>>>
>>>>> Thanks,
>>>>> Jakob.
>>>>
>>>> The xdmp:encoding-language-detect uses the ICU libraries to do the
>>>> detection. Serbian and Croatian are very closely related to each other and
>>>> have some similar orthography to Latvian (although not a great deal of
>>>> linguistic similarity, it must be said). I think the ICU libraries
>>>> probably lack some of the linguistic sophistication of Google's backend.
>>>>
>>>> It has nothing to do with the licensing options.
>>>>
>>>> //Mary
>>>>
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Jakob Fix <[email protected]>
>>>>> Date: Sat, Feb 28, 2015 at 10:59 PM
>>>>> Subject: question about xdmp:encoding-language-detect
>>>>> To: General Mark Logic Developer Discussion
>>>>> <[email protected]>
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> using ML7.0-3, the above function, given more than 3500 characters of
>>>>> Latvian news story text, returns Croatian twice and Serbian once in
>>>>> the top three results:
>>>>>
>>>>> <encoding-language xmlns="xdmp:encoding-language-detect">
>>>>>   <encoding>utf-8</encoding>
>>>>>   <language>hr</language>
>>>>>   <score>7.081</score>
>>>>> </encoding-language>
>>>>> <encoding-language xmlns="xdmp:encoding-language-detect">
>>>>>   <encoding>utf-8</encoding>
>>>>>   <language>hr</language>
>>>>>   <score>7.012</score>
>>>>> </encoding-language>
>>>>> <encoding-language xmlns="xdmp:encoding-language-detect">
>>>>>   <encoding>utf-8</encoding>
>>>>>   <language>sr</language>
>>>>>   <score>6.882</score>
>>>>> </encoding-language>
>>>>> ...
>>>>>
>>>>> and no Latvian in sight. Google translate as well as
>>>>> detectlanguage.com correctly and with sufficient self-assurance return
>>>>> the correct result.
>>>>>
>>>>> Can someone explain what the reason behind this lack of confidence and
>>>>> the wrong detection is? Do you need the right language pack (I'm
>>>>> playing around with the developer licence which I thought is
>>>>> full-featured)? Is this something that needs training? The doc doesn't
>>>>> say so.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> cheers,
>>>>> Jakob.
>>>>> _______________________________________________
>>>>> General mailing list
>>>>> [email protected]
>>>>> http://developer.marklogic.com/mailman/listinfo/general
>>>>
>>>>
>>>> --
>>>> Using Opera's revolutionary email client: http://www.opera.com/mail/
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
