On Thursday 21 March 2013 11:48:14 Ignacio Serantes wrote:
> On Wed, Mar 20, 2013 at 7:50 PM, <[email protected]> wrote:
> > On Wednesday 20 March 2013 19:08:04 Ignacio Serantes wrote:
> > > On Wed, Mar 20, 2013 at 6:28 PM, <[email protected]> wrote:
> > > > On Wednesday 20 March 2013 17:11:24 Ignacio Serantes wrote:
> > > > > There are several places to obtain lyrics in a predictable
> > > > > format through an API, for example LyricWiki, which even
> > > > > supports multiple languages. One example with kana, rōmaji
> > > > > and English versions is here: 宇多田ヒカル ー 光
> > > > > <http://lyrics.wikia.com/%E5%AE%87%E5%A4%9A%E7%94%B0%E3%83%92%E3%82%AB%E3%83%AB_(Hikaru_Utada):%E5%85%89>.
> > > >
> > > > Wow... that's quite a nice format, and it contains some features
> > > > which we definitely didn't think of when nmm was drafted. Makes
> > > > me feel good about not prematurely standardizing on lyrics
> > > > metadata.
> > > >
> > > > > There is also a file format for storing synchronized lyrics,
> > > > > http://en.wikipedia.org/wiki/LRC_(file_format), and there are
> > > > > several servers offering lyrics in this format.
> > > > >
> > > > > The problem here is not obtaining lyrics but how reliable the
> > > > > subtitle is, because basically those databases are created by
> > > > > people, like Wikipedia.
> > > >
> > > > Of course, but if there's a critical mass, it's only a matter of
> > > > time before it becomes good enough. Is there a critical mass for
> > > > synchronized lyrics? Human-made lyrics translation is a very
> > > > appealing feature, but are these also available in a
> > > > synchronized format?
> > >
> > > For some titles yes, basically English translations. But, in my
> > > case, my lyrics fetcher has a method to translate lyrics to any
> > > language using Google Translate.
> > > So there is manual translation and automatic translation.
> > >
> > > And, of course, the same goes for the romanization process, which
> > > is easier to automate but not reliable because, for example,
> > > automatic Japanese romanization is impossible, so it always needs
> > > a manual correction.
> > >
> > > Lyrics are like movie subtitles. They have an original language
> > > and zero or more translations, and for certain languages, like
> > > Japanese or Korean, there is a romanized form too. The main
> > > difference is that most of the time there are no timestamps,
> > > except in the LRC format.
> >
> > I don't yet know what the best approach to handling multiple
> > languages is. N3 in theory lets you do tricks like
> > {<uri> <property> literal@language. }, but is this supported in
> > Nepomuk-KDE? Is there a language code for romanized Japanese (most
> > likely no :( )?
>
> As far as I know, no for both.
>
> > > > And the million dollar question, if we standardize on LRC which
> > > > looks quite backwards-compatible with plaintext: should we
> > > > store it as-is in a single property?
> > > >
> > > > Any nasty corner cases we might not like?
> > > >
> > > > A good starting point for the discussion seems to be nmm:lyrics
> > > > in LRC format + plaintext dump into nie:plainTextContent.
> > >
> > > If both LRC and non-LRC lyrics are supported, that seems good to
> > > me, but maybe we could use [00:00:00] when there are no
> > > timestamps.
> >
> > Non-LRC is already supported using nie:plainTextContent. There's
> > simply no point in introducing a dedicated property without also
> > placing some useful restrictions on it. It's hard for me to tell how
> > broken the [00:00:00] approach is.
> >
> > But to make the right decision, we need to know for sure which
> > formats have the critical mass... this is something you probably
> > know better than me.
>
> I don't understand what you mean about the formats.
> LRC is widely used, and for plain text there are several APIs
> available.
Ok so it is THE standard. Good to know.

> About LRC: the format is simple. The displayed text must be erased
> when another timestamp is reached, so the following example doesn't
> break the LRC format:
> [00:00.00]Line one
> [00:00.00]Line two
> [00:00.00]Line three
> [99.60.60]
>
> This will display three lines for the whole song duration, but it
> will require support for the LRC format.

Ok.

> So in brief and for sure:
> 1) nie:plainTextContent for plain lyrics

Yes, for now.

> For future development
> 1) A new ontology, nmm:lyrics, for the LRC format.
> 2) Add support for transliteration/romanization.
> 3) Add support for multiple languages.

Sounds almost like a generic subtitle ontology. We should give it a
shot. Now if only someone familiar with subtitles could provide some
input...

> > > > > Finally, there are a bunch of lyrics fetchers because they
> > > > > are easy to implement, and I even wrote two: one, now
> > > > > deprecated, for Amarok 2 written in jscript, and the one I
> > > > > use on a daily basis, written in Python.
> > > > >
> > > > > On Wed, Mar 20, 2013 at 4:52 PM, <[email protected]> wrote:
> > > > > > On Wednesday 20 March 2013 16:06:39 Ignacio Serantes wrote:
> > > > > > > Extracted from ontology documentation:
> > > > > > >
> > > > > > > Plain-text representation of the content of a
> > > > > > > InformationElement with all markup removed. The main
> > > > > > > purpose of this property is full-text indexing and
> > > > > > > search. Its exact content is considered
> > > > > > > application-specific. The user can make no assumptions
> > > > > > > about what is and what is not contained within.
> > > > > > > *Applications should use more specific properties
> > > > > > > wherever possible*.
> > > > > >
> > > > > > *wherever possible*.
> > > > > > The rationale for not adding a specific property like
> > > > > > nmm:lyrics was that such a property might be underspecified
> > > > > > and effectively useless. Also, this would mean lots of
> > > > > > content types would get their own "nicely named plain-text
> > > > > > version of the data without any strict serialization
> > > > > > requirements" properties without any useful result either.
> > > > > >
> > > > > > To put "The user can make no assumptions about what is and
> > > > > > what is not contained within" into musical context: typical
> > > > > > data ripped off a lyrics site might contain lyrics only, or
> > > > > > lyrics prepended with band, title or who knows what else.
> > > > > > The format can be quite "flexible" too, and it gets even
> > > > > > worse if you use several lyrics sources.
> > > > > >
> > > > > > So, the user who knows what nmm:MusicPiece is also knows
> > > > > > that you can get a somewhat useful, but not
> > > > > > machine-readable text dump in nie:plainTextContent which is
> > > > > > likely to also contain lyrics, and that's exactly what you
> > > > > > get from a typical lyrics site.
> > > > > >
> > > > > > Properly implemented lyrics need a rather clean feed, and
> > > > > > who knows, maybe it shouldn't even be implemented as a
> > > > > > single text property. Maybe a subtitle-like approach,
> > > > > > "time-stamped text", is a better idea?
> > > > > >
> > > > > > Or maybe I missed some important development and there's a
> > > > > > very good authoritative lyrics DB with a predictable format
> > > > > > and we should get started on defining nmm:lyrics?
> > > > > > I don't monitor this actively...
> > > > > >
> > > > > > > When documentation informs you that other ontologies
> > > > > > > should be used, I have doubts.
> > > > > > >
> > > > > > > On Wed, Mar 20, 2013 at 2:54 PM, <[email protected]> wrote:
> > > > > > > > On Tuesday 19 March 2013 20:22:14 Ignacio Serantes wrote:
> > > > > > > > > Hi list,
> > > > > > > > >
> > > > > > > > > As a first step to add music lyrics to Nepomuk I will
> > > > > > > > > add support for lyrics frames in audio files in
> > > > > > > > > taglibextractor, and this data will be stored in
> > > > > > > > > nie:plainTextContent
> > > > > > > > > <http://www.semanticdesktop.org/ontologies/nie/#plainTextContent>
> > > > > > > > > because there is no better place to store this
> > > > > > > > > information.
> > > > > > > >
> > > > > > > > This is the proper place to store lyrics.

_______________________________________________
Nepomuk mailing list
[email protected]
https://mail.kde.org/mailman/listinfo/nepomuk
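
For reference, the LRC behaviour described in this thread (lines that
share a timestamp are shown together, and the display is cleared when
the next timestamp arrives) can be sketched in a few lines of Python.
This is only a rough sketch under stated assumptions, not anything taken
from Nepomuk or taglibextractor: the names parse_lrc, visible_at and
plain_text are hypothetical, only [mm:ss.xx] tags at the start of a line
are handled, and the closing tag is written as [03:00.00] here, which is
an assumption about how an empty end-of-song tag might look.

```python
import re
from bisect import bisect_right

# One LRC time tag such as [01:23.45]: minutes, seconds, optional fraction.
TIME_TAG = re.compile(r"\[(\d+):(\d{2})(?:\.(\d{1,3}))?\]")


def parse_lrc(text):
    """Return a list of (seconds, [lines]) groups sorted by time.

    Lines that share a timestamp end up in the same group. A tag with
    no text creates an empty group, which clears the display.
    """
    groups = {}
    for raw in text.splitlines():
        line = raw.strip()
        stamps = []
        # A line may start with several tags, e.g. a repeated chorus.
        while (m := TIME_TAG.match(line)):
            minutes, seconds, frac = m.groups()
            stamps.append(int(minutes) * 60 + int(seconds)
                          + float("0." + (frac or "0")))
            line = line[m.end():]
        for t in stamps:
            groups.setdefault(t, [])
            if line:
                groups[t].append(line)
    return sorted(groups.items())


def visible_at(groups, seconds):
    """Lines to display at a playback position given in seconds."""
    times = [t for t, _ in groups]
    i = bisect_right(times, seconds)
    return groups[i - 1][1] if i else []


def plain_text(groups):
    """Timestamp-free dump, e.g. for full-text indexing."""
    return "\n".join(line for _, lines in groups for line in lines)


# Hypothetical sample modelled on the example quoted in the thread.
SAMPLE = """[00:00.00]Line one
[00:00.00]Line two
[00:00.00]Line three
[03:00.00]"""

groups = parse_lrc(SAMPLE)
print(visible_at(groups, 42.0))
print(plain_text(groups))
```

plain_text() gives the kind of timestamp-free dump that the thread
suggests storing in nie:plainTextContent, while the original LRC text
could go as-is into a future nmm:lyrics property.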
