D10694: epubextractor: Handle multiple subjects better

2018-12-05 Thread Stefan Brüns
bruns commandeered this revision. bruns edited reviewers, added: michaelh; removed: bruns. bruns added a comment. This revision is now accepted and ready to land. Picking this up ... REPOSITORY R286 KFileMetaData BRANCH multi-subject REVISION DETAIL https://phabricator.kde.org/D10694

D10694: epubextractor: Handle multiple subjects better

2018-05-09 Thread Stefan Brüns
bruns requested changes to this revision. bruns added a comment. This revision now requires changes to proceed. Restricted Application edited subscribers, added: Baloo, kde-frameworks-devel; removed: Frameworks. Discussion regarding subject vs keywords is still pending, i.e. no conclusion.

D10694: epubextractor: Handle multiple subjects better

2018-04-19 Thread Stefan Brüns
bruns added inline comments. INLINE COMMENTS > michaelh wrote in epubextractor.cpp:85 > I think we should port away from libepub. Multiple titles result in one > ';'-joined string. > Also it seems to be unmaintained. The joined titles is the fault of this epubextractor AFAICS - see

D10694: epubextractor: Handle multiple subjects better

2018-04-19 Thread Michael Heidelbach
michaelh added inline comments. INLINE COMMENTS > bruns wrote in epubextractor.cpp:85 > I think we should add each title individually (there may be one per language). > > Dito for all other properties, see below. I think we should port away from libepub. Multiple titles result in one

D10694: epubextractor: Handle multiple subjects better

2018-04-19 Thread Stefan Brüns
bruns added inline comments. INLINE COMMENTS > epubextractor.cpp:85 > > -QString value = fetchMetadata(ePubDoc, EPUB_TITLE); > +QString value = fetchMetadataString(ePubDoc, EPUB_TITLE); > if (!value.isEmpty()) { I think we should add each title individually (there may be one per

D10694: epubextractor: Handle multiple subjects better

2018-04-18 Thread Matthieu Gallien
mgallien accepted this revision. mgallien added a comment. This revision is now accepted and ready to land. Thanks REPOSITORY R286 KFileMetaData BRANCH multi-subject REVISION DETAIL https://phabricator.kde.org/D10694 To: michaelh, mgallien, dfaure Cc: bruns, astippich, #frameworks,

D10694: epubextractor: Handle multiple subjects better

2018-04-14 Thread Stefan Brüns
bruns added a comment. Sorry to join late here ... A QMap can store multiple values for one key, and a client reading the Map can use QMap::values() to get a list of all matching properties. If a client naively uses value() instead, it just gets the first value for the key, but so be

D10694: epubextractor: Handle multiple subjects better

2018-04-14 Thread Michael Heidelbach
michaelh updated this revision to Diff 32124. michaelh edited the summary of this revision. michaelh added a comment. Remove tests from epubextractortest use multivaluetest instead Depends on D12197 REPOSITORY R286 KFileMetaData CHANGES SINCE LAST

D10694: epubextractor: Handle multiple subjects better

2018-04-14 Thread Michael Heidelbach
michaelh edited the summary of this revision. REPOSITORY R286 KFileMetaData REVISION DETAIL https://phabricator.kde.org/D10694 To: michaelh, mgallien, dfaure Cc: bruns, astippich, #frameworks, ashaposhnikov, michaelh, spoorun, navarromorales, isidorov, firef, ngraham, andrebarros,

D10694: epubextractor: Handle multiple subjects better

2018-04-14 Thread Michael Heidelbach
michaelh added a dependency: D12197: autotests: Test for multiple values. REPOSITORY R286 KFileMetaData REVISION DETAIL https://phabricator.kde.org/D10694 To: michaelh, mgallien, dfaure Cc: bruns, astippich, #frameworks, ashaposhnikov, michaelh, spoorun, navarromorales, isidorov, firef,

D10694: epubextractor: Handle multiple subjects better

2018-03-30 Thread Michael Heidelbach
michaelh added a dependent revision: D11820: Handle properties with multiple values. REPOSITORY R286 KFileMetaData REVISION DETAIL https://phabricator.kde.org/D10694 To: michaelh, mgallien, dfaure Cc: astippich, #frameworks, ashaposhnikov, michaelh, spoorun, navarromorales, isidorov,

D10694: epubextractor: Handle multiple subjects better

2018-03-30 Thread Michael Heidelbach
michaelh updated this revision to Diff 30972. michaelh added a comment. - Additionally use keywords property This way the value-type of subject can be preserved REPOSITORY R286 KFileMetaData CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D10694?vs=30858=30972 BRANCH

D10694: epubextractor: Handle multiple subjects better

2018-03-30 Thread Michael Heidelbach
michaelh added a comment. The semicolons problem is fixed now. See T8196 REPOSITORY R286 KFileMetaData REVISION DETAIL https://phabricator.kde.org/D10694 To: michaelh, mgallien, dfaure Cc: astippich, #frameworks, ashaposhnikov, michaelh, spoorun,

D10694: epubextractor: Handle multiple subjects better

2018-03-29 Thread Michael Heidelbach
michaelh added a comment. Finally I understood how those multiple `add()`s were meant and it makes more sense now. We're still not good to go because of this: $ balooctl index test.epub Indexing ~/frameworks/kfilemetadata/autotests/samplefiles/test.epub File(s) indexed

D10694: epubextractor: Handle multiple subjects better

2018-03-29 Thread Michael Heidelbach
michaelh updated this revision to Diff 30858. michaelh edited the summary of this revision. michaelh added a comment. - Add multiple properies to map REPOSITORY R286 KFileMetaData CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D10694?vs=27632=30858 BRANCH multi-subject

D10694: epubextractor: Handle multiple subjects better

2018-03-13 Thread Michael Heidelbach
michaelh added a comment. A Either I'm completely off-track here or `kfilemetadata` is not doing this correctly. I see a lot of statements like `artist += ', ' + value` -> no list! `result->add` calls `QMap->addMulti().` This forces the client to iterate over the map, which is

D10694: epubextractor: Handle multiple subjects better

2018-03-13 Thread Matthieu Gallien
mgallien added a comment. In D10694#224465 , @michaelh wrote: > In D10694#224440 , @mgallien wrote: > > > Could you please update your diff and we can land it ? This is a useful improvement. >

D10694: epubextractor: Handle multiple subjects better

2018-03-13 Thread Michael Heidelbach
michaelh added a comment. In D10694#224440 , @mgallien wrote: > Could you please update your diff and we can land it ? This is a useful improvement. 1. We can't land it yet. baloo searching breaks with this patch. baloo has be adapted

D10694: epubextractor: Handle multiple subjects better

2018-03-13 Thread Matthieu Gallien
mgallien added a comment. In D10694#222674 , @michaelh wrote: > @mgallien: I think we have a beautiful misunderstanding here :-) > > In D10694#221719 , @michaelh wrote: > > > This is bad! ...

D10694: epubextractor: Handle multiple subjects better

2018-03-10 Thread Michael Heidelbach
michaelh added a comment. @mgallien: I think we have a beautiful misunderstanding here :-) In D10694#221719 , @michaelh wrote: > This is bad! ... I was referring to baloo inablility to handle string lists and not to this diff.

D10694: epubextractor: Handle multiple subjects better

2018-03-09 Thread Alexander Stippich
astippich added a comment. In D10694#221734 , @mgallien wrote: > In D10694#221719 , @michaelh wrote: > > > This is bad! I have learned baloo itself doesn't handle stringlists. Which in my view

D10694: epubextractor: Handle multiple subjects better

2018-03-09 Thread Michael Heidelbach
michaelh added a comment. In D10694#221734 , @mgallien wrote: > I believe this is quite the opposite. I am already getting strings list for some audio metadata. That would be great. Please point me to the respective code in elisa. I want

D10694: epubextractor: Handle multiple subjects better

2018-03-09 Thread Matthieu Gallien
mgallien added a comment. In D10694#221719 , @michaelh wrote: > This is bad! I have learned baloo itself doesn't handle stringlists. Which in my view would be the natural way to handle token-like items like tags, keywords and subject(s). Until

D10694: epubextractor: Handle multiple subjects better

2018-03-08 Thread Michael Heidelbach
michaelh added a comment. This is bad! I have learned baloo itself doesn't handle stringlists. Which in my view would be the natural way to handle token-like items like tags, keywords and subject(s). Until this is changed in baloo I'm putting this diff on hold and fro the time being work

D10694: epubextractor: Handle multiple subjects better

2018-03-08 Thread Matthieu Gallien
mgallien added a comment. Sorry for being so late. In D10694#215663 , @michaelh wrote: > @mgallien : taglibextractor.cpp is very hard to read. Hopefully it is no longer. INLINE COMMENTS > michaelh wrote in epubextractor.cpp:94 > Did you

D10694: epubextractor: Handle multiple subjects better

2018-02-27 Thread Matthieu Gallien
mgallien requested changes to this revision. mgallien added a comment. This revision now requires changes to proceed. In D10694#214996 , @michaelh wrote: > In D10694#213712 , @mgallien wrote: > > >

D10694: epubextractor: Handle multiple subjects better

2018-02-27 Thread Michael Heidelbach
michaelh added a comment. In D10694#213712 , @mgallien wrote: > To me, it is necessary to have a test that ensures that possible existing clients are not affected by your change. Could you add it ? The new behaviour will break clients.

D10694: epubextractor: Handle multiple subjects better

2018-02-26 Thread Matthieu Gallien
mgallien added a comment. I have created T8079 to work on baloo database update when extractors are modified and returned different data. REPOSITORY R286 KFileMetaData REVISION DETAIL https://phabricator.kde.org/D10694 To: michaelh, mgallien, dfaure

D10694: epubextractor: Handle multiple subjects better

2018-02-25 Thread Matthieu Gallien
mgallien added a comment. In D10694#210509 , @michaelh wrote: > The only component I could find to be affected by this change is `baloo-widgets`. I have already adapted it to this change. And yes, it will handle both. > It will take a some

D10694: epubextractor: Handle multiple subjects better

2018-02-20 Thread Michael Heidelbach
michaelh added a comment. The only component I could find to be affected by this change is `baloo-widgets`. I have already adapted it to this change. And yes, it will handle both. It will take a some time to publish it because some other stuff has to get reviewed first. I you know of

D10694: epubextractor: Handle multiple subjects better

2018-02-20 Thread Matthieu Gallien
mgallien added a comment. In D10694#210409 , @michaelh wrote: > @mgallien Then they won't benefit from this until a file is reindexed. > Users with baloo disabled will benefit, because in that case `baloo_filemetadata_temp_extractor` will

D10694: epubextractor: Handle multiple subjects better

2018-02-20 Thread Michael Heidelbach
michaelh added a comment. @mgallien Then they won't benefit from this until a file is reindexed. Users with baloo disabled will benefit, because in that case `baloo_filemetadata_temp_extractor` will extract the data on-the-fly. Reindexing just the ebooks can be done with: find

D10694: epubextractor: Handle multiple subjects better

2018-02-20 Thread Matthieu Gallien
mgallien added a comment. The problem with any changes to the extractors its that the baloo database of users will not be refreshed. REPOSITORY R286 KFileMetaData REVISION DETAIL https://phabricator.kde.org/D10694 To: michaelh, mgallien, dfaure Cc: #frameworks, ashaposhnikov, michaelh,

D10694: epubextractor: Handle multiple subjects better

2018-02-20 Thread Michael Heidelbach
michaelh created this revision. michaelh added reviewers: mgallien, dfaure. michaelh added projects: Baloo, Dolphin, Frameworks. Restricted Application added a subscriber: Frameworks. michaelh requested review of this revision. REVISION SUMMARY Instead of returing one long word of