Hi, (evil top posting)
given the silence, I assume any interest in baloo has stopped once more, or?

Or are there any plans how to fix up the current situation?

Greetings
Christoph

----- On 7 Oct 2016 at 20:08, cullmann cullm...@absint.com wrote:

> Hi,
>
>> Hey
>>
>> On Fri, Oct 7, 2016 at 6:34 PM, Christoph Cullmann <cullm...@absint.com>
>> wrote:
>>> Hi,
>>>
>>>> On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann <cullm...@absint.com>
>>>> wrote:
>>>>>
>>>
>>> 1) No handling of DB errors besides asserting
>>> 2) No handling of errors in the extractors (e.g. see the fixes I did, all
>>>    extractors will need more of that)
>>> 3) No handling of NFS/large inodes/inconsistencies => crash
>>>
>>> In the end, in my opinion, you can rewrite close to all parts dealing
>>> with the DB or any other thing internally. If anything ever gets
>>> inconsistent, ATM you are doomed, forever, unless by luck my new startup
>>> code deletes the index; then you live again until it is reindexed.
>>>
>>>>
>>> I am not sure, I am all for removing complete indexing and using another
>>> indexer like tracker, exactly to avoid the excursion into the DB world
>>> and how to handle it in a safe way with close to zero manpower.
>>>
>>
>> It's avoiding the problem and hoping for the best, without any experiments.
> That is not true.
>
> I did experiments and search works with tracker, but yes, a problem is
> tagging, which ATM doesn't work. Nor do I say that is a ready solution now,
> just a possibility to avoid having to maintain low-level code with at most
> 1 person (how it looks ATM).
>
> And I don't propose to go that road now, but ATM I see nobody doing any
> other experiments.
>
> Besides, tracker has been constantly maintained and used for far more than
> 5 years:
>
> https://github.com/GNOME/tracker/graphs/contributors
>
>>
>>>
>>> => It is good that we agree, but I find it very astonishing that we use
>>> baloo in its current state more or less mandatorily on all those systems
>>> where it will fail by design.
>>>
>>> (and it fails if you read the bugs)
>>>
>>
>> There is a certain amount of failure, but it's not "by-design". But
>> maybe I'm not seeing things clearly.
> You yourself stated that neither 32-bit issues nor NFS nor inodes larger
> than 32 bits have any error handling. And that seems to have been known
> even during design, and still we now have this as a framework used by
> default by any Plasma installation on exactly the systems featuring that,
> without error checking.
>
>>
>>>>
>>>>>>
>>>>>> How about requirements such as resource consumption, ease of
>>>>>> integration, search speed are taken into consideration? Come on guys.
>>>>>> We're engineers over here.
>>
>>>>> What is the argument here? If you take a look at bugs.kde.org, you see
>>>>> that people are complaining about all of that with baloo. I see no
>>>>> evidence anywhere that e.g. baloo is "superior" to what GNOME uses
>>>>> or any other solution (perhaps besides nepomuk, ok...).
>>
>> What tests have been done to obtain the evidence?
> What tests have been done to obtain the inverse evidence? I only hear the
> complaint about not taking requirements like resource consumption or speed
> into account, but there is ATM zero evidence that e.g. tracker is slower.
>
> And yes, there are "it hogs 100% memory or time" bugs open, though you can
> hardly reproduce them, as people are somehow scared to pack up their home
> and send it to us. Not that a lot of those bugs got touched at all in
> Bugzilla.
>
>>
>>>
>>>>
>>>> Yup, you have. It's awesome. I no longer have the motivation to work on
>>>> Baloo.
>>> Thanks, but that makes me very sad, btw.
>>> Baloo came up to replace nepomuk, which was dead because it had too many
>>> bugs and all maintainers left.
>>> Now we have baloo, which has many bugs, some even by design, and the
>>> maintainer left, too.
>>>
>>
>> Actually, Nepomuk was not dead. I was maintaining it. I killed it
>> because it had too many structural problems.
>>
>> This is how the open source world works. People work on projects and
>> when it no longer scratches their itch (I no longer use Baloo), they
>> lose interest. This is "supposed" to be a hobby.
> That is ok, to see it as a hobby.
>
> But I am a bit unnerved that one proposes this as the generic index solution
> for our desktop, which should be stable, if nothing else, while knowing that
> it has severe limitations that are not handled (see above). I would have
> assumed that at least the known "can't work here" cases are handled in a
> graceful way.
>
> And given that one of the first things main.cpp of baloo_file does is:
>
>     // HACK: Untill we start using lmdb with robust mutex support. We're just going to remove
>     // the lock manually in the baloo_file process.
>     QFile::remove(path + "/index-lock");
>
> that doesn't leave high hopes, sorry.
>
> And the typical error check is:
>
>     void MTimeDB::put(quint32 mtime, quint64 docId)
>     {
>         Q_ASSERT(mtime > 0);
>         Q_ASSERT(docId > 0);
>
>         MDB_val key;
>         key.mv_size = sizeof(quint32);
>         key.mv_data = static_cast<void*>(&mtime);
>
>         MDB_val val;
>         val.mv_size = sizeof(quint64);
>         val.mv_data = static_cast<void*>(&docId);
>
>         int rc = mdb_put(m_txn, m_dbi, &key, &val, 0);
>         Q_ASSERT_X(rc == 0, "MTimeDB::put", mdb_strerror(rc));
>     }
>
> without any way to pass an error to the outside, nor any error handling code
> on the outside, as if no error could ever occur that is non-fatal.
>
>>
>>>
>>>> (This is why they run on a separate process)
>>> That doesn't help, it just OOMs your system => dead, it needs resource
>>> restrictions, which is tricky to get right.
>>>
>>
>> You're right. It needs a better thought out solution. A separate
>> process is the bare minimum.
>>
>> Btw, have you looked if Tracker actually does any of this?
> It has process separation and it handles crashes well enough to not screw up
> client process queries. And it has maintained extractors or miners, unlike
> us. But for sure, it has bugs and crashes and all things, but it is
> maintained and has had a constant stream of fixes for a longer time than
> baloo + all predecessors together.
>
>>
>>>> My hostility was because the proposal ignores key points such as -
>>>>
>>>> * Indexing Speed
>>>> * Search speed
>>>> * Database size
>>> => If you look at the bugs, people complain we are inferior, and I do not
>>> see that the proposal ignores it. I just do not see how to compare, given
>>> there are no hard facts that we are faster than e.g. tracker in any way.
>>>
>>
>> Data can be gathered about it. Not all data is publicly available.
> That would make any decision easier to take.
>
>>
>>>> * Ease of use with our existing components
>>> My proposal did not change the interface at all, it has zero impact on
>>> "ease of use".
>>>
>>>> * Ease of fixing problems in the code
>>> My estimate would be: rewrite close to everything. Even the basic 64-bit
>>> int id won't work with 64-bit inodes, each DB call must be touched to
>>> check for errors, and at each place one will need to check for potential
>>> inconsistencies and exit gracefully...
>>>
>>
>> I don't follow why everything needs to be re-written? Am I missing
>> something or do we just need to check for more errors and use a higher
>> integer id? This certainly doesn't seem super trivial, but it sounds
>> like less work than implementing a shim on top of Tracker.
> If you look at your own code, you will see that there is no error handling
> at all, besides asserts. (see above)
>
> There is not even the concept of passing an error out to higher levels.
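
As a rough illustration of what "passing an error out to higher levels" could look like, here is a minimal sketch. It is not Baloo code: DbResult, FakeStore, putMTime and indexDocument are hypothetical names made up for this example, and FakeStore only imitates the mdb_put() convention of returning 0 on success and a non-zero code on failure. The error code 5 is arbitrary.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical result type handed back to callers instead of asserting.
struct DbResult {
    int code;             // 0 on success, non-zero on failure
    std::string message;  // human-readable description, empty on success
    bool ok() const { return code == 0; }
};

// Stand-in for the storage layer (NOT LMDB). It follows the same convention
// as mdb_put(): 0 means success, anything else is an error code.
class FakeStore {
public:
    bool failNextWrite = false;  // lets the example simulate a failing write

    int put(std::uint32_t key, std::uint64_t value) {
        if (failNextWrite) {
            failNextWrite = false;
            return 5;  // arbitrary illustrative error code
        }
        m_data.insert({key, value});
        return 0;
    }

    std::size_t size() const { return m_data.size(); }

private:
    std::multimap<std::uint32_t, std::uint64_t> m_data;
};

// A put() variant that reports failures to its caller instead of aborting.
DbResult putMTime(FakeStore& store, std::uint32_t mtime, std::uint64_t docId)
{
    if (mtime == 0 || docId == 0) {
        return DbResult{-1, "invalid argument: mtime and docId must be non-zero"};
    }
    const int rc = store.put(mtime, docId);
    if (rc != 0) {
        return DbResult{rc, "storage write failed with code " + std::to_string(rc)};
    }
    return DbResult{0, ""};
}

// Higher-level caller: propagates the error so that the layer above can decide
// what to do (skip the file, schedule a reindex, shut down cleanly, ...).
// Returns true if the document was stored, false otherwise.
bool indexDocument(FakeStore& store, std::uint32_t mtime, std::uint64_t docId,
                   std::string& errorOut)
{
    const DbResult result = putMTime(store, mtime, docId);
    if (!result.ok()) {
        errorOut = result.message;
        return false;
    }
    return true;
}
```

The point of the sketch is only the shape: every layer returns a status, so a failed write ends up as a decision in the caller rather than as an abort deep inside the DB wrapper.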
>
> Perhaps I am wrong, because there is only a bit of documentation in addition
> to the code, but if you start to add error handling at the DB calls, you can
> start to rewrite all internal layers.
>
> Besides, I don't see any documentation of the DB format, but I could have
> missed that. (at least not in the git nor https://community.kde.org/Baloo)
>
>>
>> I could be wrong.
> So could I ;=)
>
>>
>>>>
>>>> Baloo has certain speed requirements if it is to be used with krunner,
>>>> and we want instant feedback. This was an integral requirement.
>>> I doubt e.g. tracker has different requirements, as it is used in similar
>>> places by GNOME.
>>>
>>> But leaving all that aside, do you have a proposal how to fix up the
>>> current situation?
>>> Are you willing to invest some work to fix the current issues, or do you
>>> have an idea what would be a good way to tackle them?
>>>
>>
>> I probably will not work more in Baloo.
>>
>> I'll have to investigate the problems a bit more. From the cursory
>> look of this thread, it doesn't seem that the problems are that dire.
>> But I may not be reading into it correctly.
> What would be highly appreciated would be a bit of documentation of what the
> different pieces do and stuff like that, even if you have no time to code.
>
> Greetings
> Christoph
>
> --
> ----------------------------- Dr.-Ing. Christoph Cullmann ---------
> AbsInt Angewandte Informatik GmbH      Email: cullm...@absint.com
> Science Park 1                         Tel:   +49-681-38360-22
> 66123 Saarbrücken                      Fax:   +49-681-38360-20
> GERMANY                                WWW:   http://www.AbsInt.com
> --------------------------------------------------------------------
> Geschäftsführung: Dr.-Ing. Christian Ferdinand
> Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234

--
----------------------------- Dr.-Ing.
Christoph Cullmann ---------
AbsInt Angewandte Informatik GmbH      Email: cullm...@absint.com
Science Park 1                         Tel:   +49-681-38360-22
66123 Saarbrücken                      Fax:   +49-681-38360-20
GERMANY                                WWW:   http://www.AbsInt.com
--------------------------------------------------------------------
Management: Dr.-Ing. Christian Ferdinand
Registered in the commercial register of the Amtsgericht Saarbrücken, HRB 11234