An explicit "description" tag eliminates the heuristic problem, but I think it has other problems.
It relies on markup, which raises the bar for contributors and complicates any description-editing UX. That is, when a user taps the edit pencil to the right of the description, instead of showing just the description in a simple editable text box with a small prompt to "Enter a concise description of 'article title'", you'd have to show the first section's wikitext and explain the description markup.

It also conflates two concerns: a concise description, and some sub-portion of the first section text. I can appreciate the desire to write descriptive information only once, but this comes at a cost: changes made to improve the description would also have to be proofed to ensure they still work in the sub-portion context.

> On Mar 22, 2015, at 7:28 PM, Dmitry Brant <[email protected]> wrote:
>
> Hi Monte!
> inline:
>
> > Deeply hard, in fact, because it's complicated not only by language syntax
> > and grammatical rules, but also by qualitative factors (readability,
> > meaning, context, relevance etc).
> > This already complicated situation then becomes many orders of magnitude
> > more difficult because these qualitative factors can differ between
> > languages.
>
> Again, I agree that this is not an easy problem. However, in the case of
> language translations, automated descriptions have the potential of
> simplifying things tremendously. The algorithm for the grammar and syntax of
> a certain language needs to be written only once. And once it's written, it
> can be applied to every Wikidata item, past and future. Sure, there would
> likely be a different algorithm for each language, and maybe even different
> algorithms for various taxa of Wikidata items. But this kind of solution
> simply feels more scalable, and I'm surprised that researching methods of
> accomplishing this are of little interest.
>
> > I predict this won't be any worse than what happened when we enabled
> > section editing.
>
> But when we enabled section editing, did we do it with a prominent call to
> action? I just feel a little hesitation about going full-on with something
> like this, without having a baseline level of administrative feedback in the
> apps (e.g. a notification for when a description is reverted, and the reason
> for it).
>
> To be clear, of course I'm totally on board for experimenting with allowing
> users to contribute descriptions. Making bold moves is what makes our team so
> great. My goal is simply to point out various other solutions that, to me,
> make slightly more sense (and to welcome feedback on why they don't!).
>
> > But reducing the first sentence in this way is deceptively complicated to
> > do programmatically, precisely because of the word "arguably" in the
> > preceding sentence - it's almost entirely a matter of qualitative
> > judgement. You have to know what a fish is to know what parts of the first
> > sentence are most important
>
> That's almost convincing :) but still... why duplicate content when the
> essential information is already there?
> Maybe I didn't convey my idea of "markup" for extracting a description
> properly. For example, the description for the [[Fish]] article can be marked
> up as follows:
>
> A fish is any member of a paraphyletic group of organisms that consist of all
> <description>gill-bearing aquatic craniate animal</description>s that lack
> limbs with digits.
>
> The above markup would be done by a human editor, with the knowledge that the
> text within the <description> tag will end up as the Wikidata description. I
> would wager that a similar scheme could be applied to any number of articles.
> Let's try it for a few random articles:
>
> [[Poland]]
> Poland (Polish: Polska; pronounced [ˈpɔlska] ( listen)), officially the
> Republic of Poland (Polish: Rzeczpospolita Polska; pronounced
> [ʐɛt͡ʂpɔˈspɔʎit̪a ˈpɔlska] ( listen)), is a <description>country in Central
> Europe</description> bordered by Germany to the west; the Czech Republic and
> Slovakia to the south...
>
> [[Schadenfreude]]
> Schadenfreude (/ˈʃɑːdənfrɔɪdə/; German: [ˈʃaːdn̩ˌfʀɔɪ̯də] ( listen)) is
> <description>pleasure derived from the misfortunes of
> others</description>.[1] This word is taken from German...
>
> [[Ming dynasty]]
> The Ming dynasty, also Empire of the Great Ming, was the <description>ruling
> dynasty of China for 276 years (1368–1644)</description> following the
> collapse of the Mongol-led Yuan dynasty...
>
> [[Homomorphism]]
> In abstract algebra, a homomorphism is a <description>structure-preserving
> map between two algebraic structures</description> (such as groups, rings, or
> vector spaces)...
>
> ^^ What would be the downside(s) of doing something like that?
>
>> On Sun, Mar 22, 2015 at 9:37 PM, Monte Hurd <[email protected]> wrote:
>> My previous reply was partial and accidentally sent - here's my actual reply
>> :)
>>
>>> On Sun, Mar 22, 2015 at 1:53 PM, Dmitry Brant <[email protected]> wrote:
>>> Hi Lydia,
>>>
>>> Indeed, there are many more Wikidata items than Wikipedia articles.
>>> However, the users of our mobile apps only see Wikipedia articles in our
>>> search results (at least for now), which means that they will only be able
>>> to contribute descriptions to Wikidata items for which a Wikipedia article
>>> exists.
>>
>> They are also used in "Recent" and "Nearby" and Vibha wants them in "Saved
>> Pages" list as well.
>>
>>> No doubt, the description field is an important component of each Wikidata
>>> entry. But, when there is a corresponding Wikipedia article, why not query
>>> it to provide an automatic description? This could be based on the first
>>> sentence of the article, or a subset of the first sentence, or some other
>>> kind of metadata within the article.
>>
>> Why not query it to provide an automatic description? Because finding the
>> best subset of the first sentence(s) isn't all there is to it.
>>
>> For example, take the enwiki "Fish" article.
>>
>> The first couple sentences are these:
>>
>> A fish is any member of a paraphyletic group of organisms that consist of
>> all gill-bearing aquatic craniate animals that lack limbs with digits.
>> Included in this definition are the living hagfish, lampreys, and
>> cartilaginous and bony fish, as well as various extinct related groups.
>>
>> So if we reduce the description to its first sentence we have:
>>
>> A fish is any member of a paraphyletic group of organisms that consist of
>> all gill-bearing aquatic craniate animals that lack limbs with digits.
>>
>> Now, for the sake of argument, let's imagine the bold words below represent
>> a best case scenario for a relevant subset of the first sentence:
>>
>> A fish is any member of a paraphyletic group of organisms that consist of
>> all gill-bearing aquatic craniate animals that lack limbs with digits.
>>
>> So, we have "A fish is a gill-bearing aquatic animal", or you could reduce
>> it further to "a gill-bearing aquatic animal".
>>
>> But reducing the first sentence in this way is deceptively complicated to do
>> programmatically, precisely because of the word "arguably" in the preceding
>> sentence - it's almost entirely a matter of qualitative judgement. You have
>> to know what a fish is to know what parts of the first sentence are most
>> important and then you have to know how to contextually stitch these words
>> together according to rules of the language's grammar and syntax so they
>> "read" nicely (see the word "a" and the "s" on the end of "animals").
>>
>> Basically, great descriptions require a native speaker of the language with
>> some skill at summarizing. This is such a low bar for humans that almost
>> anyone could contribute quality descriptions.
>>
>> But if descriptions are not human editable, then we are stuck with the
>> limitations of whatever heuristics are used to auto-generate the description.
>>
>>> The key is that the description would stay with the article, which would
>>> eliminate the need for duplication and synchronization.
>>>
>>> So, in a sense, I would look at it the other way: descriptions within
>>> Wikipedia articles would be useful for Wikidata entries.
>>>
>>> -Dmitry
>>>
>>>> On Sun, Mar 22, 2015 at 4:17 PM, Lydia Pintscher <[email protected]> wrote:
>>>> On Sun, Mar 22, 2015 at 9:10 PM, Dmitry Brant <[email protected]> wrote:
>>>> > Hi Jane,
>>>> >
>>>> > Perhaps my comments came off as more pessimistic than I intended. Of
>>>> > course I believe in the power of crowdsourcing, and I would never want
>>>> > to make anyone feel like their contributions are being marginalized.
>>>> >
>>>> > I'll agree for now that the idea of "fully" automated descriptions leans
>>>> > more towards science fiction than reality. :)
>>>> >
>>>> > However, my whole point has more to do with the apparent duplication of
>>>> > content that seems to be happening between the first sentence of
>>>> > Wikipedia articles and the corresponding Wikidata description. There's
>>>> > something about it that seems unnecessary. If we can figure out a way to
>>>> > automatically extract the description from the first sentence of the
>>>> > article, it would simplify things in two ways:
>>>> >
>>>> > 1) People wouldn't need to edit Wikidata descriptions, and would instead
>>>> > focus on improving the Wikipedia article.
>>>> > 2) People who monitor changes made to articles would need to monitor only
>>>> > the article, instead of the article plus its corresponding Wikidata
>>>> > description.
>>>>
>>>> There are a lot more items on Wikidata than articles on Wikipedia. And
>>>> not every language has a Wikipedia article for each item. Don't just
>>>> look at descriptions on Wikidata as something useful for Wikipedia.
>>>> They're much more than that.
>>>>
>>>> Cheers
>>>> Lydia
>>>>
>>>> --
>>>> Lydia Pintscher - http://about.me/lydia.pintscher
>>>> Product Manager for Wikidata
>>>>
>>>> Wikimedia Deutschland e.V.
>>>> Tempelhofer Ufer 23-24
>>>> 10963 Berlin
>>>> www.wikimedia.de
>>>>
>>>> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>>>>
>>>> Registered in the register of associations of the Amtsgericht
>>>> Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit by
>>>> the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
>>>>
>>>> _______________________________________________
>>>> Mobile-l mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
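
To make the <description> markup idea from the thread concrete, here is a minimal sketch of how a client might consume it. It assumes the tag is a literal <description>...</description> pair in the first-section text; the tag itself and the helper names extract_description and strip_description_markup are hypothetical and do not correspond to any existing MediaWiki or Wikidata API.

```python
import re
from typing import Optional

# Hypothetical tag proposed in the thread; not an existing MediaWiki feature.
_DESCRIPTION_RE = re.compile(r"<description>(.*?)</description>", re.DOTALL)


def extract_description(wikitext: str) -> Optional[str]:
    """Return the text of the first <description> span, or None if absent."""
    match = _DESCRIPTION_RE.search(wikitext)
    if match is None:
        return None
    # The tagged span may wrap across lines, so collapse runs of whitespace.
    return " ".join(match.group(1).split())


def strip_description_markup(wikitext: str) -> str:
    """Return the text as a reader would see it, with the tags removed."""
    return _DESCRIPTION_RE.sub(lambda m: m.group(1), wikitext)


fish = (
    "A fish is any member of a paraphyletic group of organisms that consist "
    "of all <description>gill-bearing aquatic craniate animal</description>s "
    "that lack limbs with digits."
)

print(extract_description(fish))       # gill-bearing aquatic craniate animal
print(strip_description_markup(fish))  # the original sentence, untagged
```

The Fish example also shows the coupling discussed above: the extracted fragment is the singular "animal" only because the closing tag was placed before the "s", so any edit to the sentence has to be checked against both the article text and the extracted description.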
