Re: [Wikitech-l] [gsoc] splitting the img_metadata field into a new table
I'm going to use this message to respond to several people at once; hopefully it doesn't become confusing.

Markus wrote:
> [snip]
> (1) You use mediumblob for values.

I'll be honest, I chose a type at random for that field. It needed to be long, since it has to store rather long strings: some metadata formats don't have length limits on strings. (In that version of the new table plan, anyway. Based on feedback, I think I'll try to make my plan for tables much simpler.)

> Each row in your table specifies...meta_qualifies

In XMP you can have a special type of property that, instead of being a property of the image, modifies the meaning of another property. The example given in the spec was that if you have a creator property, you could have a qualifier for that property named role that denotes whether that author property is the singer, the writer, or whatever. Its most common use seems to be when multiple thumbnails of the image are stored in XMP at different resolutions: it uses qualifiers to specify the resolutions of the different choices (which is kind of a moot example for us, as I don't think we want to be storing embedded thumbnails of the image in the db). The column was meant to be a boolean flag saying whether this property was a sub-property of the parent or modified the meaning of the parent.

> But overall, I am quite excited to see this project progressing. Maybe we could have some more alignment between the projects later on (How about combining image metadata and custom wiki metadata about image pages in queries? :-) but for GSoC you should definitely focus on your core goals and solve this task as good as possible.

Based on the comments I received, I might be moving towards a simpler table layout, which will probably be less aligned with SMW_light's goals, but I'd love to see more alignment where it fits into the goals of my project. Personally, I've always thought that a lot of the SMW stuff was rather cool.
On Fri, May 28, 2010 at 3:28 PM, Neil Kandalgaonkar ne...@wikimedia.org wrote:
> [snip]
> Okay, I just wrote a little novel here, but please take it as just opening a discussion. I think you should try for a simpler design, but I'm open to discussion.

After reading the comments so far I tend to agree that perhaps my original design was a bit more complicated than it needed to be. Scalability is pretty much the number one concern, so the simpler the better.

> BLOBS OF SERIALIZED PHP ARE GOOD
>
> You should not be afraid of storing (some) data as serialized PHP, *especially* if it's a complex data structure. If the database doesn't need to query or index on a particular field, then it's a huge win NOT to parse it out into columns and reassemble it into PHP data structures on every access.
>
> GO FOR MEANINGFUL DATA, NOT DATA PRESERVATION
>
> Okay onto the next topic -- how you want to parse XMP out into a flat structure, with links between them. I think you were clever in how you tried to make the cost of storing the tree relatively minimal, but I just question whether it's necessary to store it at all, and whether this meets our needs.
> [snip]
> So we shouldn't attempt to make a meta-metadata-format that has all the features of all possible metadata formats. Instead we should just standardize on one, hardcoded, metadata format that's useful for our purposes, and then translate other formats to that format. The simplest thing is just a flat series of columns. In other words, something like this:
> [snip]
> And of course metadata formats differ, and not all metadata fields need to be queryable or indexable. It would be perfectly acceptable to parse out some common interesting metadata into columns, and leave all the other random stuff in a serialized PHP blob, much as we have today. That structure could be recursive or whatever floats your boat.

Hmm, I like the idea of using the serialized blobs generally, and then exposing a special few interesting properties in another table.
I was actually thinking that perhaps page_props could be used for this. Currently all it contains is the hidden category listings (well, theoretically any extension can store stuff there using $wgPagePropLinkInvalidations, but I have yet to see an extension use that, which is a little surprising, as it seems like a wonderful way to make really cool extensions really easily). Although that table seems to be meant more for properties that change the behaviour of the page they belong to in some way (like __HIDDENCAT__), any metadata stored there would still be a property, so I don't think that's abusing its purpose too much. Really there seems to be no reason to create a new table if that one will do fine.

> Thanks a lot for presenting your design here in detail. If you want to take it to a wiki I can reiterate some of this debate on your design's talk page.

Thank you for responding; your post has given me a lot to think about. I still have a lot to learn about databases, and
Re: [Wikitech-l] [gsoc] splitting the img_metadata field into a new table
2010/5/31 bawolff bawolff...@gmail.com:
> I'm going to use this message to respond to several people with this email, hopefully it doesn't become confusing.

Since you're managing EXIF data, also consider the possibility of *adding* to it some relevant Commons metadata (name of the File: page, copyright, categories). It would be great, if possible, for downloaded images to contain them. I'm far from deep into the matter; it is only an old, layman's idea I had when first approaching the magic of EXIF.

Alex

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [gsoc] splitting the img_metadata field into a new table
Hi Alex. That's actually on my to-do list if I have time. Building a metadata editor for files on the wiki (probably in the form of an extension) would be in phase 2 of my project. (In my project proposal it was on the list of things to do if I have extra time.)

Cheers,
bawolff

On Mon, May 31, 2010 at 5:31 AM, Alex Brollo alex.bro...@gmail.com wrote:
> 2010/5/31 bawolff bawolff...@gmail.com
>> I'm going to use this message to respond to several people with this email, hopefully it doesn't become confusing.
>
> Since you're managing EXIF data, also consider the possibility of adding to it some relevant Commons metadata (name of the File: page, copyright, categories). It would be great, if possible, for downloaded images to contain them. I'm far from deep into the matter; it is only an old, layman's idea I had when first approaching the magic of EXIF.
>
> Alex
Re: [Wikitech-l] [gsoc] splitting the img_metadata field into a new table
On 31 May 2010 14:14, bawolff bawolff...@gmail.com wrote:
> I'm going to use this message to respond to several people with this email, hopefully it doesn't become confusing.
>
> Hmm, I like the idea of using the serialized blobs generally, and then exposing some special few interesting properties into another table. I was actually thinking that perhaps page_props could be used for this. Currently all it contains is the hidden category listings (well, theoretically any extension can store stuff there using $wgPagePropLinkInvalidations, but I have yet to see an extension use that, which is a little surprising, as it seems like a wonderful way to make really cool extensions really easily). Although that table seems to be meant more for properties that change the behaviour of the page they belong to in some way (like __HIDDENCAT__), any metadata stored there would still be a property, so I don't think that's abusing its purpose too much. Really there seems to be no reason to create a new table if that one will do fine.
>
> [...]
>
> I think the page_props table would be the best way to implement bug 8298.
>
> Actually I was reading up on the page_props table the other day, and I believe that in the commit implementing that table, bug 8298 was given as an example of something cool the table could be used to implement.

I tried to use page_props once. I ended up using my own table, since the parser thinks it owns the page_props table, and whenever a page is parsed it happily deletes every value stored in page_props that it doesn't know about.

-Niklas

--
Niklas Laxström
Re: [Wikitech-l] [gsoc] splitting the img_metadata field into a new table
2010/5/31 Niklas Laxström niklas.laxst...@gmail.com:
> I tried to use page_props once. I did end up using my own table, since the parser thinks it owns the page_props table and when page is parsed it happily deletes all values stored in page_props it doesn't know about.

The parser does indeed own the page_props table. It's intended for storing properties that can be derived at parse time and set by the parser itself or by a parser hook through $parserOutput->setProperty($name, $value).

Roan Kattouw (Catrope)
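The ownership behaviour Niklas and Roan describe can be pictured with a toy model. This is only a sketch in Python, not MediaWiki code: the class ToyPageProps and its methods are hypothetical names, and the assumption is simply that each parse replaces a page's rows with whatever the parser output carries.

```python
class ToyPageProps:
    """Toy stand-in for a parser-owned property table (hypothetical, not MediaWiki's code)."""

    def __init__(self):
        # Maps (page_id, property_name) -> value.
        self.rows = {}

    def write_directly(self, page_id, name, value):
        # What an extension might try: write a row behind the parser's back.
        self.rows[(page_id, name)] = value

    def on_parse(self, page_id, parser_props):
        # Assumed behaviour: drop every row for the page, then store only
        # the properties the parser (or a parser hook) set during this parse.
        self.rows = {k: v for k, v in self.rows.items() if k[0] != page_id}
        for name, value in parser_props.items():
            self.rows[(page_id, name)] = value


store = ToyPageProps()
store.write_directly(1, "exif_artist", "Someone")
store.on_parse(1, {"hiddencat": ""})
print(sorted(store.rows))
```

In this model a directly written row only survives until the next parse, which is why values that cannot be derived at parse time would need a table of their own.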
Re: [Wikitech-l] [gsoc] splitting the img_metadata field into a new table
On Mon, May 31, 2010 at 7:04 AM, Roan Kattouw roan.katt...@gmail.com wrote:
> 2010/5/31 Niklas Laxström niklas.laxst...@gmail.com:
>> I tried to use page_props once. I did end up using my own table, since the parser thinks it owns the page_props table and when page is parsed it happily deletes all values stored in page_props it doesn't know about.
>
> The parser does indeed own the page_props table. It's intended for storing properties that can be derived at parse time and set by the parser itself or a parser hook through $parserOutput->setProperty($name, $value).
>
> Roan Kattouw (Catrope)

Ah, I was thinking that looked like something too perfectly fitted to the situation to be true. However, I still think it may be a good approach to generally keep the metadata as a serialized PHP blob, and have another table, similar in shape to page_props, to store specific metadata values of interest.

cheers,
bawolff
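The layout this sub-thread converges on (one opaque blob per image, plus a narrow name/value table for the few fields worth querying) might look roughly like the sketch below. It is written in Python with SQLite and JSON purely for illustration; the table image_props, its ip_* columns and the INDEXED_FIELDS list are hypothetical and do not come from the thread, which talks about serialized PHP in MediaWiki's own database.

```python
import json
import sqlite3

# Hypothetical choice of fields worth exposing for queries.
INDEXED_FIELDS = ("Artist", "DateTimeOriginal", "Model")


def create_schema(conn):
    # The blob keeps the full (possibly nested) metadata; it is never queried.
    conn.execute("CREATE TABLE image (img_name TEXT PRIMARY KEY, img_metadata BLOB)")
    # A narrow table, shaped like page_props, for the queryable subset.
    conn.execute(
        "CREATE TABLE image_props ("
        " ip_image TEXT, ip_name TEXT, ip_value TEXT,"
        " PRIMARY KEY (ip_image, ip_name))"
    )
    conn.execute("CREATE INDEX ip_name_value ON image_props (ip_name, ip_value)")


def save_metadata(conn, name, metadata):
    blob = json.dumps(metadata).encode("utf-8")
    conn.execute("REPLACE INTO image (img_name, img_metadata) VALUES (?, ?)", (name, blob))
    # Rebuild the queryable rows from scratch so they never drift from the blob.
    conn.execute("DELETE FROM image_props WHERE ip_image = ?", (name,))
    rows = [(name, f, str(metadata[f])) for f in INDEXED_FIELDS if f in metadata]
    conn.executemany("INSERT INTO image_props VALUES (?, ?, ?)", rows)


def load_metadata(conn, name):
    row = conn.execute(
        "SELECT img_metadata FROM image WHERE img_name = ?", (name,)
    ).fetchone()
    return json.loads(row[0].decode("utf-8")) if row else None


def find_images(conn, prop, value):
    cur = conn.execute(
        "SELECT ip_image FROM image_props"
        " WHERE ip_name = ? AND ip_value = ? ORDER BY ip_image",
        (prop, value),
    )
    return [r[0] for r in cur]
```

The point of the split is that reads of the full metadata cost one row fetch and one decode, while searches touch only the small indexed table.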
Re: [Wikitech-l] Anyone with CSS fu that can help out on Flagged Revs?
On Mon, May 31, 2010 at 1:38 AM, Maciej Jaros e...@wp.pl wrote:
> Hm... Indeed... Weird. I haven't noticed that Wiki is shown in compatibility view just because of the by-domain settings... That alone at least doesn't change much in horizontal positions of elements (only the search bar seems to be affected). The site might be in IE7 compatibility mode, but it still should use correct box widths. Still, using CSS for positioning is always risky unless an element to be positioned is anchored inside an element over which you want to position it.

More precisely, absolute positioning is only a good idea if (a) you know for sure which element will act as the anchor, and (b) you really want a fixed offset from the borders of that element without regard to any other elements that may be present. Neither of those seems to be the case here, so absolute positioning doesn't seem ideal.
[Wikitech-l] revision of Nuke that works with postgres
The Nuke extension doesn't work with postgres (https://bugzilla.wikimedia.org/show_bug.cgi?id=23600). Is there a revision that contains a version that does? Right now (for 1.13.2) the snapshot returned by the Nuke extension page is r37906. This produces the error given in the bug ticket.

Regards,

--
Dan Nessett
[Wikitech-l] a wysiwyg editor for wikipedia?
Dear all,

I've started to develop a simple WYSIWYG editor that could be useful to Wikipedia. Basically the editor gets the wiki code from Wikipedia and builds the HTML on the client side. Then you can edit the HTML as you would expect, and when you are done another script converts the HTML back to wiki code.

There is a simple demo here: http://www.corefarm.com:8080/wysiwyg?article=Open_innovation . You can try other pages from http://www.corefarm.com:8080/ (type the article name).

It's far from being really usable now, but do you think that such a tool would be useful? The global structure is OK and most of the buttons are working (even if there are no special images to show what they actually do); it's just a matter of filling the gaps and supporting all of the Wikipedia syntax.

Your comments are welcome!

All the best,
William
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
On 31 May 2010 22:53, William Le Ferrand will...@corefarm.com wrote:
> I've started to develop a simple WYSIWYG editor that could be useful to Wikipedia. Basically the editor gets the wiki code from Wikipedia and builds the HTML on the client side. Then you can edit the HTML as you would expect, and when you are done another script converts the HTML back to wiki code. There is a simple demo here: http://www.corefarm.com:8080/wysiwyg?article=Open_innovation . You can try other pages from http://www.corefarm.com:8080/ (type the article name). Your comments are welcome!

What you're working on is a *hard* problem a lot of people have attempted, with varying success. There's been some discussion of this of late on mediawiki-l.

The default editor on new wikia.com wikis is WYSIWYG-only. This works tolerably well (a bit buggy, but actively worked on).

The common WYSIWYG solution for MediaWiki is FCKeditor, which works *almost* pretty well but:

(a) falls down badly when you try to mix WYSIWYG editing with wikitext editing, and chews up the wikitext;
(b) doesn't cope very well with the weirdest stuff done on English Wikipedia, where wikitext is tortured horribly to squeeze out every possible emergent side-effect for editors' use;
(c) is not actually being worked on at the moment. (Though one mediawiki-l contributor says he's been using it to good effect on his work intranet and is seeking permission to release his changes back under the GPL.)

http://mediawiki.fckeditor.net/
http://www.mediawiki.org/wiki/Extension:FCKeditor_%28Official%29

FCK+MediaWiki discussion recently:
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/thread.html#33896
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/thread.html#34061

You may care to have a look at FCKeditor+MediaWiki and see whether you've just reinvented the wheel and can help get that up to scratch. Or, alternatively, whether your approach is different enough and better enough to pursue nevertheless ;-)

- d.
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
On Mon, May 31, 2010 at 5:53 PM, William Le Ferrand will...@corefarm.com wrote:
> I've started to develop a simple WYSIWYG editor that could be useful to Wikipedia. Basically the editor gets the wiki code from Wikipedia and builds the HTML on the client side. Then you can edit the HTML as you would expect, and when you are done another script converts the HTML back to wiki code.

Wiki syntax is too complicated for this to be feasible. It also doesn't have a one-to-one mapping to HTML. It's been tried before, but what you end up with is that it doesn't round-trip: if you open a page in the WYSIWYG editor and save with no changes, it saves totally different wikicode, confusing anyone who's using actual wikitext. The only feasible solutions are to either drastically simplify wikitext or switch to WYSIWYG only, and those would both be very disruptive.
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
On 31 May 2010 23:31, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> Wiki syntax is too complicated for this to be feasible. It also doesn't have a one-to-one mapping to HTML. It's been tried before, but what you end up with is that it doesn't round-trip: if you open in the WYSIWYG editor and save with no changes, it saves totally different wikicode, confusing anyone who's using actual wikitext. The only feasible solutions are to either drastically simplify wikitext, or switch to WYSIWYG only, and those would both be very disruptive.

... and the problem with the wikitext being mangled is that diffs show a 100% text change for any alteration whatsoever and become useless, even if the content isn't semantically mangled.

FCK+MW is so *almost* there it's frustrating how close it seems. But then, this is a problem where the last 5% of the list is the last 95% of the work.

(Will *this* be what tempts me to take up coding? Be afraid ... I have the algorithmic insight of a sysadmin and only know about force and brute force ...)

- d.
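The round-trip problem raised in the last two messages can be seen even with a toy converter. The sketch below is Python and handles only bold and italics; the two regex-based functions are hypothetical stand-ins, not how any of the editors mentioned in the thread work. Because both ''' and <b> render as bold, the conversion back to wikitext cannot tell which one the author typed.

```python
import difflib
import re


def wikitext_to_html(text):
    # Two different spellings of "bold" in the source collapse into one in the HTML.
    html = re.sub(r"'''(.+?)'''", r"<b>\1</b>", text)
    return re.sub(r"''(.+?)''", r"<i>\1</i>", html)


def html_to_wikitext(html):
    # The reverse step has to pick a single canonical spelling.
    text = re.sub(r"<b>(.+?)</b>", r"'''\1'''", html)
    return re.sub(r"<i>(.+?)</i>", r"''\1''", text)


def round_trip(text):
    return html_to_wikitext(wikitext_to_html(text))


def changed_lines(before, after):
    # Keep only the added/removed lines from a line-by-line diff.
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


source = "'''Bold''' and <b>also bold</b>\nUntouched line"
saved = round_trip(source)
for line in changed_lines(source, saved):
    print(line)
```

Here the page renders identically before and after, yet the first line shows up as changed in the diff although the user edited nothing. With real wikitext there are far more such many-to-one cases (whitespace, templates, HTML tags, table syntax), which is where the noisy diffs come from.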
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
> The default editor on new wikia.com wikis is WYSIWYG-only. This works tolerably well (a bit buggy, but actively worked on).

It's not WYSIWYG only. You can switch between that and regular wikitext whilst you're on the edit page using the source button, or you can select which you want to use in your preferences.

A new version was released a couple of weeks ago, so many of the older bugs are now resolved. There are some new ones of course, but it's improving all the time. :)

Angela
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
On 31 May 2010 23:50, Angela bees...@gmail.com wrote:
>> The default editor on new wikia.com wikis is WYSIWYG-only. This works tolerably well (a bit buggy, but actively worked on).
>
> It's not WYSIWYG only. You can switch between that and regular wikitext whilst you're on the edit page using the source button, or you can select which you want to use in your preferences. A new version was released a couple of weeks ago, so many of the older bugs are now resolved. There are some new ones of course, but it's improving all the time. :)

:-D How installable and workable is this on, say, any random work intranet with a fairly generic MW 1.16 installation? I may give it a go at work ...

[cc to mediawiki-l]

- d.
Re: [Wikitech-l] [Mediawiki-l] a wysiwyg editor for wikipedia?
On Tue, Jun 1, 2010 at 8:55 AM, David Gerard dger...@gmail.com wrote:
> How installable and workable is this on, say, any random work intranet with a fairly generic MW 1.16 installation? I may give it a go at work

As far as I know, the problem right now is that Wikia's rich text editor requires core changes, rather than simply being an extension you can install. The code is all available at https://svn.wikia-code.com/ but it still needs to be packaged up to make it easier for third-party use.

> ... and the problem with the wikitext being mangled is that diffs show a 100% text change for any alteration

Luckily this is now a rare bug, and not the norm. Work is ongoing to resolve the remaining causes of this. The aim is for diffs to work in exactly the same way as they do with wikitext-only.

Angela
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
On 31 May 2010 23:37, David Gerard dger...@gmail.com wrote:
> On 31 May 2010 23:31, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
>> Wiki syntax is too complicated for this to be feasible. It also doesn't have a one-to-one mapping to HTML. It's been tried before, but what you end up with is that it doesn't round-trip: if you open in the WYSIWYG editor and save with no changes, it saves totally different wikicode, confusing anyone who's using actual wikitext. The only feasible solutions are to either drastically simplify wikitext, or switch to WYSIWYG only, and those would both be very disruptive.

The other solution is to use a proper MVC framework: define everything in terms of modifications to the wikitext (you can then constrain what those modifications are, to avoid mangling) and run the result through a parser to generate the HTML preview.

Alternatively, if your wikitext modifications are constrained enough, it is possible to implement each modification as a pair of functions, one of which edits the wikitext while the other edits the HTML (this is the method used by English Wiktionary for the translation-adding interface, and it makes undo/redo really easy).

Building such a thing is time-consuming, particularly if you have to ensure that the wikitext modification and the HTML modification are equivalent, as there's a pretty large number of things people would like to do with wikitext. That said, it's quite possible to use a WYSIWYG for editing the contents of a paragraph, so you could have one action for changing the content of a paragraph in addition to actions for inserting, deleting and moving things around. (In a perfect world, a WYSIWYG would trigger constrained actions based on user interaction. That is the hard part of this; the rest is just complicated.)

There's already a JavaScript thing for general template argument modifications (based on XML somehow), so this would be extendable to work with templates too.
Conrad
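The "pair of functions" idea Conrad describes could be sketched along these lines. This is a Python illustration with one hypothetical action, AppendListItem; it is not the English Wiktionary code, and it assumes the page ends in a bulleted list and that the item text is plain (no markup or HTML escaping needed).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppendListItem:
    """One constrained edit: append an item to the bulleted list that ends the page."""

    text: str

    def apply_wikitext(self, wikitext):
        return wikitext + "\n* " + self.text

    def apply_html(self, html):
        if not html.endswith("</ul>"):
            raise ValueError("this action only applies when the page ends with a list")
        return html[: -len("</ul>")] + "<li>" + self.text + "</li></ul>"

    def undo_wikitext(self, wikitext):
        suffix = "\n* " + self.text
        if not wikitext.endswith(suffix):
            raise ValueError("nothing to undo")
        return wikitext[: -len(suffix)]

    def undo_html(self, html):
        suffix = "<li>" + self.text + "</li></ul>"
        if not html.endswith(suffix):
            raise ValueError("nothing to undo")
        return html[: -len(suffix)] + "</ul>"


# The stored wikitext and the displayed HTML are edited in lockstep;
# no HTML-to-wikitext conversion ever runs.
wikitext, html = "* apple", "<ul><li>apple</li></ul>"
action = AppendListItem("pear")
wikitext, html = action.apply_wikitext(wikitext), action.apply_html(html)
print(wikitext)
print(html)
```

Undo is just the inverse pair applied in reverse order of the actions taken, which is presumably why the approach makes undo/redo easy. The cost, as the message says, is that every kind of edit needs its own carefully matched pair.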
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
On Tue, Jun 1, 2010 at 00:31, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> Wiki syntax is too complicated for this to be feasible. It also doesn't have a one-to-one mapping to HTML. It's been tried before, but what you end up with is that it doesn't round-trip: if you open in the WYSIWYG editor and save with no changes, it saves totally different wikicode, confusing anyone who's using actual wikitext. The only feasible solutions are to either drastically simplify wikitext, or switch to WYSIWYG only, and those would both be very disruptive.

Another solution would be to identify a simple subset of wikitext for which the mapping to XHTML is one-to-one and refuse to work on anything else (i.e. revert to the standard editor). The rationale here is that a visual editor would (probably) be aimed at new editors, and they should probably avoid complicated syntax anyway. That's the approach we took in MeanEditor (http://www.mediawiki.org/wiki/Extension:MeanEditor).

In our experience, the biggest obstacle is getting the different browsers to reliably make the same changes to the HTML. The editing interface is non-standard, and browsers sometimes disagree on encoding rules, escaping, choice of tags, etc.

Anyway, there is a survey of existing approaches at http://www.mediawiki.org/wiki/WYSIWYG_editor . This might be useful to new editor developers, and if you find a cool idea it would be nice to contribute it to the page. The usability project also did a survey last year: http://usability.wikimedia.org/wiki/Environment_Survey/MediaWiki_Extensions/Results .

In the end, I think the FCKeditor developers did amazing work, but I am still convinced that a simple (and hopefully reliable) HTML-based solution would have a purpose. Also, it's nice to be able to compare different designs.

Bye,
--
Jacopo
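The "safe subset, otherwise fall back" strategy can be sketched as a simple gate in front of the visual editor. The Python below is an illustration only: the thread does not say which constructs MeanEditor accepts, so the UNSUPPORTED pattern and the function choose_editor are hypothetical.

```python
import re

# Hypothetical list of constructs outside the "safe" subset: templates,
# tables, raw HTML-like tags, list/indent markup, and magic words.
UNSUPPORTED = re.compile(
    r"\{\{|\}\}"      # template calls
    r"|\{\||\|\}"     # table start/end
    r"|<[a-zA-Z/!]"   # HTML or extension tags and comments
    r"|^[*#:;]"       # list and indent markup at line start
    r"|__[A-Z]+__",   # behaviour switches such as __HIDDENCAT__
    re.MULTILINE,
)


def choose_editor(wikitext):
    """Return "visual" only when the whole page stays inside the subset."""
    return "standard" if UNSUPPORTED.search(wikitext) else "visual"


print(choose_editor("== Heading ==\nSome '''bold''' text and a [[link]]."))
print(choose_editor("{{Infobox person|name=Example}}"))
```

Refusing the whole page rather than degrading gracefully is the design choice here: the visual editor only ever sees text it can map back exactly, so an unchanged save leaves the wikitext byte-for-byte identical.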