Re: [Wikitech-l] Finding all elements with an attribute (Parsoid?..)

2014-03-19 Thread Amir E. Aharoni
Thanks, though I was talking about a *wiki*, not about a wiki page.
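
One rough way to go from a single page to several pages is to request each page's parsed HTML from the MediaWiki Action API (action=parse) and merge the per-page tallies. The sketch below is an illustration under stated assumptions, not an established tool: parseUrl, mergeCounts and tallyPages are hypothetical names, the API endpoint is passed in by the caller, and the driver assumes a browser environment with fetch and DOMParser. The wrapper element of the parser output may itself carry a lang attribute, so it may need to be excluded from the counts. For an entire wiki, a dump would likely be more practical than one request per page.

```javascript
// Build an Action API URL that returns the parsed HTML of one page.
// apiBase is supplied by the caller (the wiki's api.php endpoint).
function parseUrl(apiBase, title) {
  const params = new URLSearchParams({
    action: "parse",
    page: title,
    prop: "text",
    format: "json",
    formatversion: "2", // with version 2, parse.text is a plain string
    origin: "*",        // allows anonymous cross-origin requests
  });
  return `${apiBase}?${params.toString()}`;
}

// Add one page's counts (Map of lang -> count) into a running total.
function mergeCounts(total, pageCounts) {
  for (const [lang, n] of pageCounts) {
    total.set(lang, (total.get(lang) || 0) + n);
  }
  return total;
}

// Browser-only driver (needs fetch and DOMParser); it is defined here
// but not called automatically.
async function tallyPages(apiBase, titles) {
  const total = new Map();
  for (const title of titles) {
    const response = await fetch(parseUrl(apiBase, title));
    const data = await response.json();
    const doc = new DOMParser().parseFromString(data.parse.text, "text/html");
    const pageCounts = new Map();
    for (const el of doc.querySelectorAll("[lang]")) {
      // Assumption: lowercase so that "HE" and "he" are counted together.
      const lang = el.getAttribute("lang").toLowerCase();
      pageCounts.set(lang, (pageCounts.get(lang) || 0) + 1);
    }
    mergeCounts(total, pageCounts);
  }
  return total;
}
```

Requests are made one at a time here on purpose, to keep the load on the wiki low; a real survey would also need a way to enumerate page titles, which this sketch leaves to the caller.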


--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬


2014-03-18 23:51 GMT+02:00 Arlo Breault :

> Maybe you're looking for,
>
> document.querySelectorAll("[lang]")
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Finding all elements with an attribute (Parsoid?..)

2014-03-18 Thread Arlo Breault
Maybe you're looking for,

document.querySelectorAll("[lang]")
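
To get from the selector above to the kind of tally asked about in the original question, the matched elements can be grouped by their lang value. The following is a minimal sketch and not something from the thread: countLangValues is a hypothetical helper name, and lowercasing the values is an assumption made here so that "HE" and "he" are counted together. It only covers the single rendered page it is run on.

```javascript
// Tally the lang attribute values of a collection of elements.
// Accepts any iterable of objects that have getAttribute(), such as the
// NodeList returned by document.querySelectorAll("[lang]").
function countLangValues(elements) {
  const counts = new Map();
  for (const el of elements) {
    // Assumption: language tags are compared case-insensitively.
    const lang = (el.getAttribute("lang") || "").trim().toLowerCase();
    counts.set(lang, (counts.get(lang) || 0) + 1);
  }
  return counts;
}

// Only runs where a DOM is available, e.g. in a browser console.
if (typeof document !== "undefined") {
  console.log(countLangValues(document.querySelectorAll("[lang]")));
}
```

Pasted into a browser console on a rendered page, this logs a Map from language code to the number of elements carrying it. Elements with an empty lang attribute are counted under the empty string.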



[Wikitech-l] Finding all elements with an attribute (Parsoid?..)

2014-03-16 Thread Amir E. Aharoni
Hello,

Is there an easy known way to find all HTML elements with a given attribute
that appear in the text of a given wiki after it's parsed?

Here's an example of something that I need: Find all elements that have the
HTML lang attribute, with any value. This would be useful for me for
collecting information about the multilingualism of Wikipedia - which
foreign languages we incorporate in pages, how often we do it, for which
of them we may have various font problems, etc. This, again, must be
checked after the page is parsed - this attribute is very often inserted by
templates.

Of course, this would rely on the editors actually using this attribute,
but this is fairly common, at least in the English Wikipedia. (Among other
things we could compare its usage between projects.)

I could do this by analyzing a dump, but I've got a hunch that something
like this was already done as part of the research for Parsoid. Does
anybody know?

Thanks!

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬