D11285: Introduce sanitizer class

2018-04-12 Thread Michael Heidelbach
michaelh added a task: T8250: Sanitize the database.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, astippich, spoorun, 
bruns, alexeymin


D11285: Introduce sanitizer class

2018-03-16 Thread Michael Heidelbach
This revision was automatically updated to reflect the committed changes.
Closed by commit R293:9de78becc063: Introduce sanitizer class (authored by 
michaelh).

REPOSITORY
  R293 Baloo

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11285?vs=29683&id=29684

REVISION DETAIL
  https://phabricator.kde.org/D11285

AFFECTED FILES
  src/engine/CMakeLists.txt
  src/engine/Messages.sh
  src/engine/databasesanitizer.cpp
  src/engine/databasesanitizer.h
  src/engine/documenturldb.cpp
  src/engine/transaction.h

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, astippich, spoorun, 
nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-16 Thread Laurent Montel
mlaurent accepted this revision.
mlaurent added a comment.
This revision is now accepted and ready to land.


  +2 for me but I don't know baloo code :)

REPOSITORY
  R293 Baloo

BRANCH
  sanitize-class (branched from master)

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, astippich, spoorun, 
nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-16 Thread Michael Heidelbach
michaelh marked 3 inline comments as done.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, astippich, spoorun, 
nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-16 Thread Michael Heidelbach
michaelh updated this revision to Diff 29683.
michaelh added a comment.


  - Turn FileInfo into a struct again

REPOSITORY
  R293 Baloo

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11285?vs=29594&id=29683

BRANCH
  sanitize-class (branched from master)

REVISION DETAIL
  https://phabricator.kde.org/D11285

AFFECTED FILES
  src/engine/CMakeLists.txt
  src/engine/Messages.sh
  src/engine/databasesanitizer.cpp
  src/engine/databasesanitizer.h
  src/engine/documenturldb.cpp
  src/engine/transaction.h

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, astippich, spoorun, 
nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-16 Thread Laurent Montel
mlaurent requested changes to this revision.
mlaurent added inline comments.
This revision now requires changes to proceed.

INLINE COMMENTS

> databasesanitizer.cpp:46
> +public:
> +FileInfo(const quint32 d = 0, const quint32 i = 0, 
> +const QString& u = QString(), const bool a = true)

you can replace this constructor by initialize directly variable

> bool accessible = true;
=

etc.

> it avoids to create this constructor
==

> databasesanitizer.cpp:99
> +QVector excludeIds;
> +for (const auto& deviceId : deviceIds) {
> +if (deviceId > 0) {

const auto & for a quint32
it's better to use directly quint32 without const'ref

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, astippich, spoorun, 
nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-15 Thread Michael Heidelbach
michaelh added inline comments.

INLINE COMMENTS

> databasesanitizer.cpp:163
> +/**
> +* Create a list of \a FileInfo items to stdout.
> +* 

Before landing this will become

  Create a list of \a FileInfo items and print it to stdout.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, 
alexeymin


D11285: Introduce sanitizer class

2018-03-15 Thread Michael Heidelbach
michaelh added a comment.


  In D11285#226298 , @mlaurent wrote:
  
  > Is it possible to create an autotest for this class ?
  
  
  This is where it gets complicated.
  Essentially I can test only for sane databases, but that's useless.
  Or I need to mock an external drive and have control over its device id, 
which seems to be very difficult if at all possible.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, 
alexeymin


D11285: Introduce sanitizer class

2018-03-15 Thread Michael Heidelbach
michaelh marked 4 inline comments as done.
michaelh added a comment.


  There were some conflicts I had to solve with this. Because it is not 
completely clear to me where this is going I wanted to be safe and used a 
d-pointer.
  As a consequence this class does the printing which I prefer to be in the cli.
  Secondly I can't make DatabaseSanitizerImpl a friend of the Transaction 
class. Which leads to the `getDocuments()` in DatabaseSanitizer, which should 
not be part of the interface. In the future more of similar functions will be 
needed.
  Is there a better way to accomplish this?

INLINE COMMENTS

> mlaurent wrote in databasesanitizer.cpp:165
> use qCDebug(BALOO)

I get a linker error
`databasesanitizer.cpp:189: undefined reference to `BALOO()'`
Currently its not worth the trouble fixing it, because that method is going to 
change anyway.
Also I don't like to mix debug statement and printing to stderr.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, 
alexeymin


D11285: Introduce sanitizer class

2018-03-15 Thread Michael Heidelbach
michaelh updated this revision to Diff 29594.
michaelh added a comment.


  - Apply suggested changes
  - Put method descriptions in their correct place

REPOSITORY
  R293 Baloo

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11285?vs=29405&id=29594

BRANCH
  sanitize-class (branched from master)

REVISION DETAIL
  https://phabricator.kde.org/D11285

AFFECTED FILES
  src/engine/CMakeLists.txt
  src/engine/Messages.sh
  src/engine/databasesanitizer.cpp
  src/engine/databasesanitizer.h
  src/engine/documenturldb.cpp
  src/engine/transaction.h

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, 
alexeymin


D11285: Introduce sanitizer class

2018-03-15 Thread Laurent Montel
mlaurent requested changes to this revision.
mlaurent added inline comments.
This revision now requires changes to proceed.

INLINE COMMENTS

> databasesanitizer.cpp:44
> +struct FileInfo {
> +quint32 deviceId; 
> +quint32 inode;

initialize value by default please

> databasesanitizer.cpp:107
> +};
> +if ((includeIds.count() > 0  && 
> !includeIds.contains(info.deviceId))
> +|| (excludeIds.count() > 0  && 
> excludeIds.contains(info.deviceId))

includeIds.count() > 0  => use !isEmpty

> databasesanitizer.cpp:153
> +const bool missingOnly, 
> +const QSharedPointer urlFilter)
> +{

const QSharedPointer &
--^

> databasesanitizer.cpp:165
> +} else {
> +qDebug()  << "Skipping" << info.url;
> +Q_ASSERT(false);

use qCDebug(BALOO)

> databasesanitizer.cpp:174
> +}
> +err << i18n("Found %1 matching items", infos.count()) << endl;
> +

as you add a i18n in engine directory you need to extract them.
you missed to add a Message.sh in this subdirectory and loading message file.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, 
alexeymin


D11285: Introduce sanitizer class

2018-03-15 Thread Laurent Montel
mlaurent added a comment.


  Is it possible to create an autotest for this class ?

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham, mlaurent
Cc: mlaurent, ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, 
alexeymin


D11285: Introduce sanitizer class

2018-03-14 Thread Michael Heidelbach
michaelh added inline comments.

INLINE COMMENTS

> databasesanitizer.cpp:200
> +
> +DocumentUrlDB DatabaseSanitizer::getDocuments(Transaction* txn)
> +{

This is the best I could think of to make `getDocuments` accessible to 
DatabaseSanitizerImpl. But I don't like it.
This method definitely should not be part of the interface.

REPOSITORY
  R293 Baloo

BRANCH
  sanitize-class (branched from master)

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham
Cc: ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-14 Thread Nathaniel Graham
ngraham accepted this revision.
ngraham added a comment.
This revision is now accepted and ready to land.


  Looks good to me!

REPOSITORY
  R293 Baloo

BRANCH
  sanitize-class (branched from master)

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks, ngraham
Cc: ngraham, smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-13 Thread Michael Heidelbach
michaelh added a dependent revision: D11287: Introduce baloodb CLI tool.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D11285

To: michaelh, #baloo, #frameworks
Cc: smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, alexeymin


D11285: Introduce sanitizer class

2018-03-13 Thread Michael Heidelbach
michaelh created this revision.
michaelh added reviewers: Baloo, Frameworks.
Restricted Application added projects: Frameworks, Baloo.
michaelh requested review of this revision.

REVISION SUMMARY
  Due to device ids being inconstant duplicates are introduced to the database. 
I. e. multiple document ids pointing to the same entity.
  This class shall eventually sanitize the database. Currently it just displays 
the issues should there be any.

REPOSITORY
  R293 Baloo

BRANCH
  sanitize-class (branched from master)

REVISION DETAIL
  https://phabricator.kde.org/D11285

AFFECTED FILES
  src/engine/CMakeLists.txt
  src/engine/databasesanitizer.cpp
  src/engine/databasesanitizer.h
  src/engine/documenturldb.cpp
  src/engine/transaction.h

To: michaelh, #baloo, #frameworks
Cc: smithjd, ashaposhnikov, michaelh, spoorun, nicolasfella, alexeymin