[Wikitech-l] Call for participation in OpenSym 2015, Aug 19-20, San Francisco!

2015-07-04 Thread Dirk Riehle

Call for participation in OpenSym 2015!

Aug 19-20, 2015, San Francisco, http://opensym.org



FOUR FANTASTIC KEYNOTES

Richard Gabriel (IBM) on Using Machines to Manage Public Sentiment on Social Media

Peter Norvig (Google) on Applying Machine Learning to Programs

Robert Glushko (UC Berkeley) on Collaborative Authoring, Evolution, and Personalization

Anthony Wasserman (CMU SV) on Barriers and Pathways to Successful Collaboration

More at 
http://www.opensym.org/category/conference-contributions/keynotes-invited-talks/




GREAT RESEARCH PROGRAM

All core open collaboration tracks, including

- free/libre/open source
- open data
- Wikipedia
- wikis and open collaboration, and
- open innovation

More at 
http://www.opensym.org/2015/06/25/preliminary-opensym-2015-program-announced/




INCLUDING OPEN SPACE

The facilities provide room and space for your own working groups.



AT A WONDERFUL LOCATION

OpenSym 2015 takes place Aug 19-20 at the Golden Gate Club in San Francisco, 
smack in the middle of the Presidio, with a wonderful view of the Golden Gate 
Bridge.


More at http://www.opensym.org/os2015/location/



REGISTRATION

Is simple, subsidized, and all-encompassing.

Find it here: http://www.opensym.org/os2015/registration/

Prices will go up after July 12th, so be sure to register early!



We would like to thank our sponsors: the Wikimedia Foundation, Google, TJEF, and 
the ACM.







[Wikitech-l] WikiSym + OpenSym 2013: Less than 2 weeks for Community Track Submissions

2013-05-07 Thread Dirk Riehle
for the demo, a specific description of what you plan 
to demo, what you hope to get out of demoing, and how the audience will 
benefit. A short note of any special technical requirements should be included.


Demo submissions will be reviewed based on their relevance to the community. 
All accepted demos will be given space at a joint demo session (90 minutes) 
during the conference.


Tutorials

Tutorials are half-day classes, taught by experts, designed to help 
professionals rapidly come up to speed on a specific technology or 
methodology. Tutorials can be lecture-oriented or participatory. Tutorial 
attendees deserve the highest standard of excellence in tutorial preparation 
and delivery. Tutorial presenters are typically experts in their chosen topic 
and experienced speakers skilled in preparing and delivering educational 
presentations. When selecting tutorials, we will consider the presenter’s 
knowledge of the proposed topic and past success at teaching it.




SUBMISSION INFORMATION AND INSTRUCTIONS

There are two submission deadlines, an early one and a regular one. The early 
deadline is for those who need an early decision on their community track 
submission. This mostly applies to workshops that require their own program 
committee and paper submission and review process (as opposed, for example, 
to walk-in workshops). Some may also need the additional time to raise funds 
and acquire a visa.


Submissions should follow the standard ACM SIG proceedings format. For advice 
and templates, please see 
http://www.acm.org/sigs/publications/proceedings-templates. All papers must 
conform at time of submission to the formatting instructions and must not 
exceed the page limits, including all text, references, appendices and 
figures. All submissions must be in PDF format.


All papers and proposals should be submitted electronically through EasyChair 
using the following URL: 
https://www.easychair.org/conferences/?conf=opensym2013community




SUBMISSION AND NOTIFICATION DEADLINES

* Early submission deadline: March 17, 2013
* Notification for early submissions: March 31, 2013
* Regular submission deadline: May 17, 2013
* Notification for regular submissions: May 31, 2013
* Camera-ready for both rounds: June 9, 2013

As long as it is still May 17 somewhere on Earth, your submission will be accepted.



COMMUNITY TRACK PROGRAM COMMITTEE

Chairs

Regis Barondeau (Université du Québec à Montréal)
Dirk Riehle (Friedrich-Alexander University Erlangen-Nürnberg)

--
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550



Re: [Wikitech-l] programmatically extracting lists from list pages on Wikipedia

2011-11-22 Thread Dirk Riehle
Try the Sweble parser for extracting structured data from Wikitext:
http://sweble.org
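
For a quick first pass (before wiring up a full parser), a rough sketch of the 
idea in Python, not using Sweble and assuming you already have the page's 
wikitext as a string, could look like this:

    import re

    # Naive sketch: wikitext list items start with '*' or '#', and internal
    # links look like [[Target]] or [[Target|label]]. Sweble's AST handles
    # nesting, templates, etc. far more robustly than this regex can.
    LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

    def extract_list_links(wikitext):
        links = []
        for line in wikitext.splitlines():
            if line.lstrip().startswith(("*", "#")):
                links.extend(LINK_RE.findall(line))
        return links

    # extract_list_links("* [[Dwight D. Eisenhower|Ike]]\n* [[John F. Kennedy]]")
    # -> ['Dwight D. Eisenhower', 'John F. Kennedy']

Anything beyond a quick experiment (templates, nested lists, tables) will break 
this kind of regex quickly, which is where Sweble's proper AST pays off.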

http://dirkriehle.com, +49 157 8153 4150, +1 650 450 8550
On Nov 22, 2011 9:35 PM, Fred Zimmerman zimzaz@gmail.com wrote:

 hi,

 I want to programmatically extract lists from list pages on Wikipedia. That
 is to say, if there is a page that mostly consists of a list (list of
 episodes, list of presidents, etc.) I want to be able to extract the list
 from the page, with article names/links.  Has anyone already done this? can
 anyone suggest a good strategy?

 FredZ



Re: [Wikitech-l] Announcing Wikihadoop: using Hadoop to analyze Wikipedia dump files

2011-09-14 Thread Dirk Riehle
Hello everyone!

Wikihadoop sounds like a great project!

I wanted to point out that you can make it even more powerful for many 
research applications by combining it with the Sweble Wikitext parser.

Doing so, you could enable Wikipedia dump processing not only at the raw XML 
dump level, but at the fine-grained level of individual elements (bold spans, 
headings, paragraphs, categories, pages, etc.).

You can learn more about Sweble here: http://sweble.org
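
To make this concrete, here is a rough sketch of my own (not part of WikiHadoop) 
of what an element-level job could look like as a Hadoop Streaming mapper in 
Python. It assumes each mapper receives <revision> XML fragments on stdin (see 
the WikiHadoop wiki below for the exact stream format), and the naive heading 
regex is exactly the part a real Wikitext parser like Sweble would replace:

    import re
    import sys

    # Wikitext headings look like "== Title ==" with 2-6 equals signs.
    HEADING_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$", re.MULTILINE)

    def emit_heading_counts(wikitext):
        # Emit "h<level> \t 1" per heading, to be summed by a reducer
        # (the standard streaming word-count pattern).
        for equals, _title in HEADING_RE.findall(wikitext):
            print("h%d\t1" % len(equals))

    if __name__ == "__main__":
        fragment = sys.stdin.read()
        # Assumption: the wikitext sits in <text>...</text> elements; real
        # code would also unescape XML entities before parsing.
        for text in re.findall(r"<text[^>]*>(.*?)</text>", fragment, re.DOTALL):
            emit_heading_counts(text)

With a real parser plugged in at that spot, the same mapper could just as easily 
count templates or categories, or diff two revisions element by element.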

Cheers,
Dirk


On 08/17/2011 06:58 PM, Diederik van Liere wrote:
 Hello!

 Over the last few weeks, Yusuke Matsubara, Shawn Walker, Aaron Halfaker and
 Fabian Kaelin (who are all Summer of Research fellows)[0] have worked hard
 on a customized stream-based InputFormatReader that allows parsing of both
 bz2 compressed and uncompressed files of the full Wikipedia dump (dump file
 with the complete edit histories) using Hadoop. Prior to WikiHadoop and the
 accompanying InputFormatReader it was not possible to use Hadoop to analyze
 the full Wikipedia dump files (see the detailed tutorial / background for an
 explanation why that was not possible).

 This means:
 1) We can now harness Hadoop's distributed computing capabilities in
 analyzing the full dump files.
 2) You can send either one or two revisions to a single mapper, so it's
 possible to diff two revisions and see what content has been added /
 removed.
 3) You can exclude namespaces by supplying a regular expression.
 4) We are using Hadoop's Streaming interface, which means people can use this
 InputFormatReader from different languages such as Java, Python, Ruby and
 PHP.

 The source code is available at: https://github.com/whym/wikihadoop
 A more detailed tutorial and installation guide is available at:
 https://github.com/whym/wikihadoop/wiki


 (Apologies for cross-posting to wikitech-l and wiki-research-l)

 [0] http://blog.wikimedia.org/2011/06/01/summerofresearchannouncement/


 Best,

 Diederik


-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] WYSIWYG and parser plans (was What is wrong with Wikia's WYSIWYG?)

2011-05-03 Thread Dirk Riehle


On 05/03/2011 08:28 PM, Neil Harris wrote:
 On 03/05/11 19:44, MZMcBride wrote:
...
 The point is that the wikitext and its parsing should be completely separate
 from MediaWiki/PHP/HipHop/Zend.

 I think some of the bigger picture is getting lost here. Wikimedia produces
 XML dumps that contain wikitext. For most people, this is the only way to
 obtain and reuse large amounts of content from Wikimedia wikis (especially
 as the HTML dumps haven't been re-created since 2008). There needs to be a
 way for others to be able to very easily deal with this content.

 Many people have suggested (with good reason) that this means that wikitext
 parsing needs to be reproducible in other programming languages. While
 HipHop may be the best thing since sliced bread, I've yet to see anyone put
 forward a compelling reason that the current state of affairs is acceptable.
 Saying "well, it'll soon be much faster for MediaWiki to parse" doesn't
 overcome the legitimate issues that re-users have (such as programming in a
 language other than PHP, banish the thought).

 For me, the idea that all that's needed is a faster parser in PHP is a
 complete non-starter.

 MZMcBride


 I agree completely.

 I think it cannot be emphasized enough that what's valuable about
 Wikipedia and other similar wikis is the hard-won _content_, not the
 software used to write and display it at any given time, which is merely a
 means to that end.

 Fashions in programming languages and data formats come and go, but the
 person-centuries of writing effort already embodied in MediaWiki's
 wikitext format need to have a much longer lifespan: having a
 well-defined syntax for its current wikitext format will allow the
 content itself to continue to be maintained for the long term, beyond
 the restrictions of its current software or encoding format.

 -- Neil

+1 to both MZMcBride and Neil.

So relieved to see things put so eloquently.

Dirk


-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] Announcing the Open Source Sweble Wikitext Parser v1.0

2011-05-01 Thread Dirk Riehle
 You should identify whether you mean MediaWikitext, or some other
 dialect -- MediaWiki Is Not The Only Wiki...

 and you should post to wikitext-l as well.  The real parser maniacs hang
 out over there, even though traffic is low.

It is MediaWiki's Wikitext; elsewhere it is usually called wiki markup.

Cheers,
Dirk

-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] Announcing the Open Source Sweble Wikitext Parser v1.0

2011-05-01 Thread Dirk Riehle
 You should identify whether you mean MediaWikitext, or some other
 dialect -- MediaWiki Is Not The Only Wiki...

 and you should post to wikitext-l as well. The real parser maniacs
 hang out over there, even though traffic is low.

 It is MediaWiki's Wikitext; elsewhere it is usually called wiki
 markup.

 Improperly and incompletely, perhaps, yes.

 I'm a MW partisan, and think it's better than nearly all its competitors,
 for nearly all uses... but even I try not to be *that* partisan.

Hmm, I never viewed it that way. IMO, the MediaWiki developers invented a wiki 
markup language and called it Wikitext; other engines just call theirs wiki markup 
or whatnot. For me, Wikitext has always been the particular markup of MediaWiki, 
much like PHP or C++ are particular language names.

Is there any other engine that calls its markup Wikitext? I'd be surprised. 
Even for WikiCreole (wikicreole.org) we used the term wiki markup.

Cheers,
Dirk

-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550

