Hi,

Unfortunately I've been hit by multiple pretty severe health scares in the last month, and have no idea when I'm going to be at 100% again.
For the time being I'd rather not hold up any development, so don't hold back anything on my account.

-- Boudhayan

On 16 October 2016 at 17:46, Christoph Cullmann <cullm...@absint.com> wrote:

> Hi,
>
> (evil top posting)
>
> given the silence, I assume any interest in baloo has stopped once more, or?
> Or are there any plans for how to fix up the current situation?
>
> Greetings
> Christoph
>
> ----- On 7 Oct 2016 at 20:08, cullmann cullm...@absint.com wrote:
>
> > Hi,
> >
> >> Hey
> >>
> >> On Fri, Oct 7, 2016 at 6:34 PM, Christoph Cullmann <cullm...@absint.com> wrote:
> >>> Hi,
> >>>
> >>>> On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann <cullm...@absint.com> wrote:
> >>>>>
> >>>
> >>> 1) No handling of DB errors besides asserting
> >>> 2) No handling of errors in the extractors (e.g. see the fixes I did, all
> >>>    extractors will need more of that)
> >>> 3) No handling of NFS/large inodes/inconsistencies => crash
> >>>
> >>> In the end, in my opinion, you can rewrite close to all parts dealing with the
> >>> DB or any other thing internally. If anything ever gets inconsistent, ATM you
> >>> are doomed, forever, unless by luck my new startup code deletes the index; then
> >>> you live again until it is reindexed.
> >>>
> >>>>
> >>> I am not sure, I am all for removing complete indexing and using another indexer
> >>> like tracker, exactly to avoid the excursion into DB world and how to handle it
> >>> in a safe way with close to zero manpower.
> >>>
> >>
> >> It's avoiding the problem and hoping for the best, without any experiments.
> > That is not true.
> >
> > I did experiments and search works with tracker, but yes, a problem is tagging,
> > which ATM doesn't work. Nor do I say that is a ready solution now, just a
> > possibility to avoid having to maintain low level code with at most 1 person
> > (how it looks ATM).
> >
> > And I don't propose to go that road now, but ATM I see nobody doing any other
> > experiments.
> >
> > Besides, tracker has been constantly maintained and used for well over 5 years:
> >
> > https://github.com/GNOME/tracker/graphs/contributors
> >
> >>
> >>> => It is good that we agree, but I find it very astonishing that we use baloo
> >>> in its current state more or less mandatorily on all those systems where it
> >>> will fail by design.
> >>>
> >>> (and it fails if you read the bugs)
> >>>
> >>
> >> There is a certain amount of failure, but it's not "by-design". But
> >> maybe I'm not seeing things clearly.
> > You yourself stated that neither 32-bit issues nor NFS nor inodes larger than
> > 32 bits have any error handling. And that seems to have been known even during
> > design, and still we have this now as a framework used by default by any Plasma
> > installation, on systems featuring exactly that, without error checking.
> >
> >>
> >>>>>> How about requirements such as resource consumption, ease of
> >>>>>> integration, search speed are taken into consideration? Come on guys.
> >>>>>> We're engineers over here.
> >>
> >>>>> What is the argument here? If you take a look at bugs.kde.org, you see that
> >>>>> people are complaining about all of that with baloo. I see no evidence
> >>>>> anywhere that e.g. baloo is "superior" to what GNOME uses or any other
> >>>>> solution (perhaps besides nepomuk, ok...).
> >>
> >> What tests have been done to obtain the evidence?
> > What tests have been done to obtain the inverse evidence? I only hear the
> > complaint about not taking requirements like resource consumption or speed into
> > account, but there is ATM zero evidence that e.g. tracker is slower.
> >
> > And yes, there are "it hogs 100% memory or time" bugs open, though you can
> > hardly reproduce them, as people are somehow scared to pack their home and
> > send it to us.
> > Not that a lot of those bugs got touched at all in Bugzilla.
> >
> >>
> >>>> Yup, you have. It's awesome. I no longer have the motivation to work on Baloo.
> >>> Thanks, but that makes me very sad, btw.
> >>> Baloo came up to replace nepomuk, which was dead because it had too many bugs
> >>> and all maintainers left.
> >>> Now we have baloo, which has many bugs, some even by design, and the maintainer
> >>> left, too.
> >>>
> >>
> >> Actually, Nepomuk was not dead. I was maintaining it. I killed it
> >> because it had too many structural problems.
> >>
> >> This is how the open source world works. People work on projects and
> >> when it no longer scratches their itch (I no longer use Baloo), they
> >> lose interest. This is "supposed" to be a hobby.
> > That is ok, to see it as a hobby.
> >
> > But I am a bit unnerved that one proposes this as the generic index solution
> > for our desktop, which should be stable, if nothing else, while knowing that it
> > has severe limitations that are not handled (see above). I would have assumed
> > that at least the known "can't work here" cases are handled in a graceful way.
> >
> > And given that one of the first things main.cpp of baloo_file does is:
> >
> >     // HACK: Untill we start using lmdb with robust mutex support. We're just going to remove
> >     // the lock manually in the baloo_file process.
> >     QFile::remove(path + "/index-lock");
> >
> > that doesn't leave high hopes, sorry.
> >
> > And the typical error check is:
> >
> > void MTimeDB::put(quint32 mtime, quint64 docId)
> > {
> >     Q_ASSERT(mtime > 0);
> >     Q_ASSERT(docId > 0);
> >
> >     MDB_val key;
> >     key.mv_size = sizeof(quint32);
> >     key.mv_data = static_cast<void*>(&mtime);
> >
> >     MDB_val val;
> >     val.mv_size = sizeof(quint64);
> >     val.mv_data = static_cast<void*>(&docId);
> >
> >     int rc = mdb_put(m_txn, m_dbi, &key, &val, 0);
> >     Q_ASSERT_X(rc == 0, "MTimeDB::put", mdb_strerror(rc));
> > }
> >
> > without any way to pass an error to the outside, nor any error handling code
> > on the outside, as if no error could ever occur that is non-fatal.
> >
> >>
> >>>> (This is why they run on a separate process)
> >>> That doesn't help, it just OOMs your system => dead. It needs resource
> >>> restrictions, which are tricky to get right.
> >>>
> >>
> >> You're right. It needs a better thought out solution. A separate
> >> process is the bare minimum.
> >>
> >> Btw, have you looked if Tracker actually does any of this?
> > It has process separation and it handles crashes well enough to not screw up
> > client process queries. And it has maintained extractors or miners, unlike us.
> > For sure, it has bugs and crashes and all things, but it is maintained and has
> > had a constant stream of fixes for a longer time than baloo + all predecessors
> > together.
> >
> >>
> >>>> My hostility was because the proposal ignores key points such as -
> >>>>
> >>>> * Indexing Speed
> >>>> * Search speed
> >>>> * Database size
> >>> => If you look at the bugs, people complain we are inferior, and I don't see
> >>> that the proposal ignores it. I just don't see how to compare, given there are
> >>> no hard facts that we are faster than e.g. tracker in any way.
> >>>
> >>
> >> Data can be gathered about it. Not all data is publicly available.
> > That would make any decision easier to take.
> >
> >>
> >>>> * Ease of use with our existing components
> >>> My proposal did not change the interface at all, it has zero impact on "ease of
> >>> use".
> >>>
> >>>> * Ease of fixing problems in the code
> >>> My estimate would be: rewrite close to everything. Even the basic 64-bit int id
> >>> won't work with 64-bit inodes, each DB call must be touched to check for
> >>> errors, and at each place one will need to check for potential inconsistencies
> >>> and exit gracefully...
> >>>
> >>
> >> I don't follow why everything needs to be re-written? Am I missing
> >> something or do we just need to check for more errors and use a higher
> >> integer id? This certainly doesn't seem super trivial, but it sounds
> >> like less work than implementing a shim on top of Tracker.
> > If you look at your own code, you will see that there is no error handling at
> > all besides asserts. (see above)
> >
> > There is not even the concept of passing an error out to higher levels.
> >
> > Perhaps I am wrong, because there is only a bit of documentation,
> > but if you start to add error handling at the DB calls, you can start to rewrite
> > all internal layers.
> >
> > Besides, I don't see any documentation of the DB format, but I could have missed
> > that. (at least not in the git nor https://community.kde.org/Baloo)
> >
> >>
> >> I could be wrong.
> > So could I ;=)
> >
> >>
> >>>> Baloo has certain speed requirements if it is to be used with krunner,
> >>>> and we want instant feedback. This was an integral requirement.
> >>> I doubt e.g. tracker has different requirements, as it is used in similar places
> >>> by GNOME.
> >>>
> >>> But all that aside, do you have a proposal for how to fix up the current
> >>> situation?
> >>> Are you willing to invest some work to fix the current issues, or do you have
> >>> an idea what would be a good way to tackle them?
> >>>
> >>
> >> I probably will not work more in Baloo.
> >>
> >> I'll have to investigate the problems a bit more. From the cursory
> >> look of this thread, it doesn't seem that the problems are that dire.
> >> But I may not be reading into it correctly.
> > What would be highly appreciated would be a bit of documentation of what the
> > different pieces do and stuff like that, even if you have no time to code.
> >
> > Greetings
> > Christoph
> >
> > --
> > ----------------------------- Dr.-Ing. Christoph Cullmann ---------
> > AbsInt Angewandte Informatik GmbH      Email: cullm...@absint.com
> > Science Park 1                         Tel:   +49-681-38360-22
> > 66123 Saarbrücken                      Fax:   +49-681-38360-20
> > GERMANY                                WWW:   http://www.AbsInt.com
> > --------------------------------------------------------------------
> > Managing Director: Dr.-Ing. Christian Ferdinand
> > Registered in the commercial register of the Amtsgericht Saarbrücken, HRB 11234
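
For reference, the point about MTimeDB::put quoted above (it asserts on failure and has no way to pass an error to the caller) could be addressed roughly like this. This is only a sketch under assumptions, not the Baloo code: DbError, fakeBackendPut, putMTime and indexOneFile are hypothetical names, and fakeBackendPut merely stands in for mdb_put so that the example builds without Qt or LMDB.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <string>

// Hypothetical error type; nothing like it exists in the quoted code.
struct DbError {
    int code;             // 0 means success, mirroring the rc == 0 check in the quoted code
    std::string message;  // human-readable description
    bool ok() const { return code == 0; }
};

// Hypothetical stand-in for mdb_put(): returns 0 on success and a
// non-zero code when asked to simulate a backend failure.
inline int fakeBackendPut(std::uint32_t /*mtime*/, std::uint64_t /*docId*/, bool simulateFailure)
{
    return simulateFailure ? -1 : 0;
}

// Variant of the put operation that reports problems to the caller
// instead of asserting on them.
inline DbError putMTime(std::uint32_t mtime, std::uint64_t docId, bool simulateFailure = false)
{
    if (mtime == 0 || docId == 0) {
        return DbError{1, "invalid argument: mtime and docId must be non-zero"};
    }
    const int rc = fakeBackendPut(mtime, docId, simulateFailure);
    if (rc != 0) {
        // With LMDB, mdb_strerror(rc) would supply this text.
        return DbError{rc, "backend put failed"};
    }
    return DbError{0, ""};
}

// Example caller: a failed write is logged and skipped, not fatal.
inline bool indexOneFile(std::uint32_t mtime, std::uint64_t docId, bool simulateFailure = false)
{
    const DbError err = putMTime(mtime, docId, simulateFailure);
    if (!err.ok()) {
        std::cerr << "skipping document " << docId << ": " << err.message << '\n';
        return false;
    }
    return true;
}
```

The design choice here is that the lowest layer returns a value the caller has to look at. That is also why adding this to the real code would touch every layer above the DB calls, which is the rewrite cost discussed in the thread.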