Re: Scrap Baloo Thread Feedback
On Tue, Mar 28, 2017 at 5:21 AM, Matthieu Gallien <gallien.matth...@gmail.com> wrote:

> Hello all,
>
> Sorry to exhume this old thread, but
>
> Is there a common agreement on the best path forward for Baloo versus
> the current situation?
>
> I have an interest in having a global KDE solution where I would help
> (as time allows). Still, I will only work after an agreement has been
> reached.
>

Hey,

thanks for stepping up! I'd say that if nobody gives you any sort of
approval, just go ahead and figure out the best plan for Baloo and then
present it here, saying that you would like Baloo to move in this
direction, detail out your proposal and if nobody objects, go ahead and
do it. KDE is about doers, after all :)

Also, if you're still a student, you may want to consider doing this as
a GSoC.

Cheers
--
Martin Klapetek
Re: Scrap Baloo Thread Feedback
Hello all, Sorry to exhume this old thread, but 2016-12-29 13:47 GMT+01:00 Dominik Haumann : > Hi all, > > CC: plasma-devel, due to stability issues > > On Fri, Oct 7, 2016 at 5:56 PM, Christoph Cullmann > wrote: >> Hi, >> > [...] >> Actually, the bugs.kde.org page tells you the facts: The bug number >> was constant increasing since > 1 year. The thread lists some other facts >> what is wrong ATM and should be fixed. > > Btw, the bug count is increasing again, just as before. So it seems > problems remain. > > [...] > >>> Right now, random requirements such as NFS and 32bit systems are >>> coming up. Are these really that important? > > Yes, it is, see below. > >>> I specifically designed >>> Baloo to not care about both network mounts and 32-bit systems. Yes, >>> Baloo has bugs and it won't handle more than 32bit-inodes. These >>> things, as all others, can be fixed. It's really a question of what is >>> important. Lets not target the outliers. Many of these decisions were >>> deliberately taken. >> That are no random requirements, sorry, you could call it random >> restrictions, too. >> That is not that productive, or? >> >> 1) 32-bit systems are still there and if that is a design decision to NOT >> support them, >> that is ok, but then bad for Plasma, no official support for 32-bit systems, >> baloo is IMHO >> the only framework with such requirements. And I see not that we have hinted >> any distro >> that they shall not compile it for 32-bit. >> >> 2) No NFS: Ok, fair game, but then, it should check that and disable itself >> completely if $HOME >> where the db is stored is a NFS, can live with that, too, but not with the >> current "we random >> crash" behavior. => That is a user experience we don't want, or? > > The reason why I am writing this mail is exactly this point: > > At the university where I was previously working, $HOME is mounted via > NFS. After an upgrade from KDE4 to Plasma 5.8, the desktop crashes > very often. And the very reason is baloo. 
> > The problem, however, is that even the sysadmins do not know that > baloo is the reason for all the crashes. In other words: Hundreds of > users probably get the impression of an unstable Plasma 5.8 - or even > worse - it boils down to "KDE sucks" or "I don't have these issues > with Ubuntu". > > This is a perfect example of extremely negative impact - the Plasma > devs can work as hard as they want, the desktop in this context will > *never* be stable unless baloo is deactivated. > > That said: Baloo needs to disable itself for everything that touches > NFS, or maybe even disable itself after it crashes several times. > > There were many more issues listed and discussed, but as far as I can > see, we did not make real advances besides some prototype based on > tracker (just a test), and some minor fixes in baloo that do not > address the hard problems. > > Sorry that this reads like a rant. This is not the intention. Instead, > the goal is to underline the still severe issues in order to get > closer to a stable desktop for our users. > > Greetings > Dominik > > > >> 3) > 32-bit inodes: That is normal and should work, but even if it should >> not: Atm you get inconsistent >> and then later assertion fails or crashs. >> >> => I can live with all restrictions but the current handling of them, that >> always ends in "crash" is >> IMHO not that acceptable. But that is "my" opinion, that might vary in the >> eyes of others. >> >>> >>> How about requirements such as resource consumption, ease of >>> integration, search speed are taken into consideration? Come on guys. >>> We're engineers over here. >> What is the argument here? If you take a look at bugs.kde.org, you see that >> people are complaining about all >> of that with baloo. I see no evidence nowhere that e.g. baloo is "superior" >> to what GNOME uses >> or any other solution (perhaps beside nepomuk, ok...). 
>> >> I fixed in a few days more bugs than were fixed in 1 year and triaged more >> than ever, still a lot is to be done. >> (and I did really not do a lot, just remove things like 'self destruct if >> index > 5GB' or 'crash for ever on >> db corruption') >> >> A graph tells more than words: >> >> https://bugs.kde.org/reports.cgi?product=frameworks-baloo&output=show_chart&datasets=CONFIRMED&datasets=ASSIGNED&datasets=REOPENED&datasets=UNCONFIRMED&datasets=RESOLVED&banner=1 >> >> Given the current open bugs, one will need to: >> >> 1) review all extractors, they have still close to zero error handling and >> will just crash or OOM you on bad files >> 2) review + fix the complete data base handling to handle errors and perhaps >> swap the DB >> 3) fix the indexer to have some resource limits to avoid OOM and Co. if e..g >> extractors fail >> ... >> >> Therefore there was my proposal, given we lack manpower, to implement baloo >> API on top of e.g. tracker to avoid all this >> and let tracker handle that. >> >> To check if that is at all feasible, I did some quick an
Re: Scrap Baloo Thread Feedback
Hi all, CC: plasma-devel, due to stability issues On Fri, Oct 7, 2016 at 5:56 PM, Christoph Cullmann wrote: > Hi, > [...] > Actually, the bugs.kde.org page tells you the facts: The bug number > was constant increasing since > 1 year. The thread lists some other facts > what is wrong ATM and should be fixed. Btw, the bug count is increasing again, just as before. So it seems problems remain. [...] >> Right now, random requirements such as NFS and 32bit systems are >> coming up. Are these really that important? Yes, it is, see below. >> I specifically designed >> Baloo to not care about both network mounts and 32-bit systems. Yes, >> Baloo has bugs and it won't handle more than 32bit-inodes. These >> things, as all others, can be fixed. It's really a question of what is >> important. Lets not target the outliers. Many of these decisions were >> deliberately taken. > That are no random requirements, sorry, you could call it random > restrictions, too. > That is not that productive, or? > > 1) 32-bit systems are still there and if that is a design decision to NOT > support them, > that is ok, but then bad for Plasma, no official support for 32-bit systems, > baloo is IMHO > the only framework with such requirements. And I see not that we have hinted > any distro > that they shall not compile it for 32-bit. > > 2) No NFS: Ok, fair game, but then, it should check that and disable itself > completely if $HOME > where the db is stored is a NFS, can live with that, too, but not with the > current "we random > crash" behavior. => That is a user experience we don't want, or? The reason why I am writing this mail is exactly this point: At the university where I was previously working, $HOME is mounted via NFS. After an upgrade from KDE4 to Plasma 5.8, the desktop crashes very often. And the very reason is baloo. The problem, however, is that even the sysadmins do not know that baloo is the reason for all the crashes. 
In other words: Hundreds of users probably get the impression of an unstable Plasma 5.8 - or even worse - it boils down to "KDE sucks" or "I don't have these issues with Ubuntu". This is a perfect example of extremely negative impact - the Plasma devs can work as hard as they want, the desktop in this context will *never* be stable unless baloo is deactivated. That said: Baloo needs to disable itself for everything that touches NFS, or maybe even disable itself after it crashes several times. There were many more issues listed and discussed, but as far as I can see, we did not make real advances besides some prototype based on tracker (just a test), and some minor fixes in baloo that do not address the hard problems. Sorry that this reads like a rant. This is not the intention. Instead, the goal is to underline the still severe issues in order to get closer to a stable desktop for our users. Greetings Dominik > 3) > 32-bit inodes: That is normal and should work, but even if it should > not: Atm you get inconsistent > and then later assertion fails or crashs. > > => I can live with all restrictions but the current handling of them, that > always ends in "crash" is > IMHO not that acceptable. But that is "my" opinion, that might vary in the > eyes of others. > >> >> How about requirements such as resource consumption, ease of >> integration, search speed are taken into consideration? Come on guys. >> We're engineers over here. > What is the argument here? If you take a look at bugs.kde.org, you see that > people are complaining about all > of that with baloo. I see no evidence nowhere that e.g. baloo is "superior" > to what GNOME uses > or any other solution (perhaps beside nepomuk, ok...). > > I fixed in a few days more bugs than were fixed in 1 year and triaged more > than ever, still a lot is to be done. 
> (and I did really not do a lot, just remove things like 'self destruct if > index > 5GB' or 'crash for ever on > db corruption') > > A graph tells more than words: > > https://bugs.kde.org/reports.cgi?product=frameworks-baloo&output=show_chart&datasets=CONFIRMED&datasets=ASSIGNED&datasets=REOPENED&datasets=UNCONFIRMED&datasets=RESOLVED&banner=1 > > Given the current open bugs, one will need to: > > 1) review all extractors, they have still close to zero error handling and > will just crash or OOM you on bad files > 2) review + fix the complete data base handling to handle errors and perhaps > swap the DB > 3) fix the indexer to have some resource limits to avoid OOM and Co. if e..g > extractors fail > ... > > Therefore there was my proposal, given we lack manpower, to implement baloo > API on top of e.g. tracker to avoid all this > and let tracker handle that. > > To check if that is at all feasible, I did some quick and dirty > implementation (still modulo filling of the metadata in the results + tagging, > which is a problem, but that was only to see if e.g. search works) > > https://quickgit.kde.org/?p=clones%2Fbaloo%2Fcullmann%2Ftbaloo.git > > That is just a pro
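The request made in this message, that Baloo should notice when its database would live on NFS and switch itself off, could be approached roughly as follows. This is a minimal sketch and not Baloo code: it assumes Linux, where `statfs()` reports a filesystem type number and NFS uses the value 0x6969 (`NFS_SUPER_MAGIC` in `<linux/magic.h>`). The names `MountKind`, `kindFromFsType` and `mountKindOf` are hypothetical, and other network filesystems would need their own checks.

```cpp
#include <sys/vfs.h>   // statfs(), Linux-specific

// Value of NFS_SUPER_MAGIC from <linux/magic.h>, repeated here so the
// sketch does not depend on kernel headers being installed.
constexpr long kNfsSuperMagic = 0x6969;

enum class MountKind { NotNfs, Nfs, Unknown };

// Pure helper: classify a filesystem type number as reported by statfs().
inline MountKind kindFromFsType(long fsType)
{
    return fsType == kNfsSuperMagic ? MountKind::Nfs : MountKind::NotNfs;
}

// Ask the kernel which filesystem holds `path`.
// Returns Unknown when the path cannot be inspected at all.
inline MountKind mountKindOf(const char* path)
{
    struct statfs info;
    if (statfs(path, &info) != 0) {
        return MountKind::Unknown;
    }
    return kindFromFsType(static_cast<long>(info.f_type));
}
```

An indexer could call `mountKindOf()` on the directory that holds its database at startup and refuse to run when the answer is `MountKind::Nfs`. What to do on `Unknown` is a policy decision this sketch leaves open.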
Re: Scrap Baloo Thread Feedback
On Sun, Oct 16, 2016 at 2:16 PM, Christoph Cullmann wrote:

> Hi,
>
> (evil top posting)
>
> given the silence, I assume any interest in baloo has stopped once more, or?
> Or are there any plans how to fixup the current situation?

I'm not going to be very involved with this. I've already expressed my
opinion. You all are free to make a decision.

--
Vishesh Handa
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 8:08 PM, Christoph Cullmann wrote: > > I did experiments and search works with tracker, but yes, a problem is > tagging,+ > which ATM doesn't work. Nor do I say that is a ready solution now, just a > possibility > to avoid having to maintain low level code with at most 1 person (how it > looks ATM). > > And I don't propose to go that road now, but ATM I see nobody doing any other > experiments. > > Besides, tracker is constantly maintained and used since >> 5 years: > > https://github.com/GNOME/tracker/graphs/contributors ok. Baloo clearly isn't being maintained. > >> >>> >>> => That is good that we agree, but I find it very astonishing that we use >>> baloo >>> in its >>> current state more or less mandatory on all that systems were it by design >>> will >>> fail. >>> >>> (and it fails if you read the bugs) >>> >> >> There is a certain amount of failure, but it's not "by-design". But >> maybe I'm not seeing things clearly. > You yourself stated that neither 32-bit issues nor NFS nor > 32-bit inodes > have any > error handling. And that seems to have been known even during design and still > we have this now as a framework per default used by any Plasma installation on > systems exactly featuring that without error checking. > >> >> >> How about requirements such as resource consumption, ease of >> integration, search speed are taken into consideration? Come on guys. >> We're engineers over here. >> > What is the argument here? If you take a look at bugs.kde.org, you see > that > people are complaining about all > of that with baloo. I see no evidence nowhere that e.g. baloo is > "superior" to > what GNOME uses > or any other solution (perhaps beside nepomuk, ok...). >> >> What tests have been to obtain the evidence? > What tests have been done to obtain the inverse evidence? I only hear here > the complaint > about not taking requirements like resource consumption or speed into > account, but > there is ATM zero evidence that e.g. 
tracker is slower.
>

I did do a lot of tests during the design of Baloo. I don't have hard
numbers. Even if I didn't, that doesn't mean a decision should be made
without gathering proper data.

> And the typical error check is:
>
> void MTimeDB::put(quint32 mtime, quint64 docId)
> {
>     Q_ASSERT(mtime > 0);
>     Q_ASSERT(docId > 0);
>
>     MDB_val key;
>     key.mv_size = sizeof(quint32);
>     key.mv_data = static_cast<void*>(&mtime);
>
>     MDB_val val;
>     val.mv_size = sizeof(quint64);
>     val.mv_data = static_cast<void*>(&docId);
>
>     int rc = mdb_put(m_txn, m_dbi, &key, &val, 0);
>     Q_ASSERT_X(rc == 0, "MTimeDB::put", mdb_strerror(rc));
> }
>
> without any way to pass an error to the outside, nor any error handling code
> at the outside,
> as no error can ever occur that is non-fatal.
>

ok. The API isn't exported. It can be changed. But we both seem to have
different opinions of how much work this would be.

>
> Besides I don't see any documentation of the DB format, but I could miss that.
> (at least not in the git nor https://community.kde.org/Baloo)
>

There isn't any.

>
> What would be highly appreciated would be a bit of documentation what the
> different pieces do and stuff like that, even if you have no time to code.
>

If you can send me specific questions about different parts I can
answer them. For "general documentation", I don't know where to start.
I usually much prefer just going through the code.

> Greetings
> Christoph
>
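The quoted `MTimeDB::put` asserts on every failure, so callers never see an error. A minimal sketch of the alternative discussed here, returning the failure instead of asserting, might look like this. It is an illustration under stated assumptions, not a patch for Baloo: `PutResult`, `putChecked` and `fakeStorePut` are hypothetical names, and `fakeStorePut` stands in for the real `mdb_put` call so that the example builds without LMDB.

```cpp
#include <cstdint>
#include <string>

// Outcome of a write: ok == true on success, otherwise code/message
// describe the failure so the caller can decide what to do.
struct PutResult {
    bool ok;
    int code;
    std::string message;
};

// Hypothetical stand-in for the storage call (mdb_put in the quoted code).
// Like LMDB functions, it returns 0 on success and non-zero on failure.
inline int fakeStorePut(std::uint32_t /*mtime*/, std::uint64_t /*docId*/,
                        bool simulateFailure)
{
    return simulateFailure ? -1 : 0;
}

// Same argument checks as the quoted MTimeDB::put, but failures are
// returned to the caller instead of being asserted.
inline PutResult putChecked(std::uint32_t mtime, std::uint64_t docId,
                            bool simulateFailure = false)
{
    if (mtime == 0 || docId == 0) {
        return PutResult{false, 0, "mtime and docId must be non-zero"};
    }
    const int rc = fakeStorePut(mtime, docId, simulateFailure);
    if (rc != 0) {
        return PutResult{false, rc, "storage write failed"};
    }
    return PutResult{true, 0, std::string()};
}
```

With a shape like this, the indexer could log the message and skip the file, or stop indexing cleanly, rather than abort the process. How much of the calling code would have to change to act on such results is exactly the point the two posters disagree on.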
Re: Scrap Baloo Thread Feedback
Hi, Unfortunately I've been hit my multiple pretty severe health scares in the last month, and have no idea when I'm going to be at 100% again. For the time being I'd rather not hold up any development, so don't hold back anything on my account. -- Boudhayan On 16 October 2016 at 17:46, Christoph Cullmann wrote: > Hi, > > (evil top posting) > > given the silence, I assume any interest in baloo has stopped once more, > or? > Or are there any plans how to fixup the current situation? > > Greetings > Christoph > > - Am 7. Okt 2016 um 20:08 schrieb cullmann cullm...@absint.com: > > > Hi, > > > >> Hey > >> > >> On Fri, Oct 7, 2016 at 6:34 PM, Christoph Cullmann > wrote: > >>> Hi, > >>> > On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann < > cullm...@absint.com> wrote: > > > >>> > >>> 1) No handling of DB errors beside asserting > >>> 2) No handling of errors in the extractors (e.g. see the fixes I did, > all > >>> extractors will need more of that) > >>> 3) No handling of NFS/large inodes/inconsistencies => crash > >>> > >>> In the end, in my opinion, you can rewrite close to all parts dealing > with the > >>> DB or > >>> any other thing internally. If ever any thing gots inconsistent, ATM > you are > >>> doomed, forever, > >>> if not by luck my new startup code deletes the index, then you live > again until > >>> it is reindexed. > >>> > > >>> I am not sure, I am all for removing complete indexing and use a other > indexer > >>> like tracker to exactly avoid the excurse into DB world and how to > handle it > >>> in a safe way with close to zero person manpower. > >>> > >> > >> It's avoiding the problem and hoping for the best, without any > experiments. > > That is not true. > > > > I did experiments and search works with tracker, but yes, a problem is > tagging,+ > > which ATM doesn't work. Nor do I say that is a ready solution now, just a > > possibility > > to avoid having to maintain low level code with at most 1 person (how it > looks > > ATM). 
> > > > And I don't propose to go that road now, but ATM I see nobody doing any > other > > experiments. > > > > Besides, tracker is constantly maintained and used since >> 5 years: > > > > https://github.com/GNOME/tracker/graphs/contributors > > > >> > >>> > >>> => That is good that we agree, but I find it very astonishing that we > use baloo > >>> in its > >>> current state more or less mandatory on all that systems were it by > design will > >>> fail. > >>> > >>> (and it fails if you read the bugs) > >>> > >> > >> There is a certain amount of failure, but it's not "by-design". But > >> maybe I'm not seeing things clearly. > > You yourself stated that neither 32-bit issues nor NFS nor > 32-bit > inodes have > > any > > error handling. And that seems to have been known even during design and > still > > we have this now as a framework per default used by any Plasma > installation on > > systems exactly featuring that without error checking. > > > >> > > >> > >> How about requirements such as resource consumption, ease of > >> integration, search speed are taken into consideration? Come on > guys. > >> We're engineers over here. > >> > > What is the argument here? If you take a look at bugs.kde.org, you > see that > > people are complaining about all > > of that with baloo. I see no evidence nowhere that e.g. baloo is > "superior" to > > what GNOME uses > > or any other solution (perhaps beside nepomuk, ok...). > >> > >> What tests have been to obtain the evidence? > > What tests have been done to obtain the inverse evidence? I only hear > here the > > complaint > > about not taking requirements like resource consumption or speed into > account, > > but > > there is ATM zero evidence that e.g. tracker is slower. > > > > And yes, there are "it hogs" 100% memory or time bugs open, thought you > can > > hardly reproduce them > > as people are somehow scared to pack their home and send it to us. Not > that a > > lot of that bugs > > got touched at all in Bugzilla. 
> > > >> > >>> > > Yup, you have. It's awesome. I no longer have the motivation to work > on Baloo. > >>> Thanks, but that makes me very sad, btw. > >>> Baloo came up to replace nepomuk, which was dead because it had too > many bugs > >>> and all maintainers left. > >>> Now we have baloo, which has many bugs, some even by design, and the > maintainer > >>> left, too. > >>> > >> > >> Actually, Nepomuk was not dead. I was maintaining it. I killed it > >> because it had too many structural problems. > >> > >> This is how the open source world works. People work on projects and > >> when it no longer scratches their itch (I no longer use Baloo), they > >> loose interest. This is "supposed" to be a hobby. > > That is ok, to see it as hobby. > > > > But I am a bit unnerved that one proposes this as the generic index > solution > > for our desktop, which should be stable, if nothing else, and knows that > it has > > severe > > limitations t
Re: Scrap Baloo Thread Feedback
Hi, (evil top posting) given the silence, I assume any interest in baloo has stopped once more, or? Or are there any plans how to fixup the current situation? Greetings Christoph - Am 7. Okt 2016 um 20:08 schrieb cullmann cullm...@absint.com: > Hi, > >> Hey >> >> On Fri, Oct 7, 2016 at 6:34 PM, Christoph Cullmann >> wrote: >>> Hi, >>> On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann wrote: > >>> >>> 1) No handling of DB errors beside asserting >>> 2) No handling of errors in the extractors (e.g. see the fixes I did, all >>> extractors will need more of that) >>> 3) No handling of NFS/large inodes/inconsistencies => crash >>> >>> In the end, in my opinion, you can rewrite close to all parts dealing with >>> the >>> DB or >>> any other thing internally. If ever any thing gots inconsistent, ATM you are >>> doomed, forever, >>> if not by luck my new startup code deletes the index, then you live again >>> until >>> it is reindexed. >>> >>> I am not sure, I am all for removing complete indexing and use a other >>> indexer >>> like tracker to exactly avoid the excurse into DB world and how to handle it >>> in a safe way with close to zero person manpower. >>> >> >> It's avoiding the problem and hoping for the best, without any experiments. > That is not true. > > I did experiments and search works with tracker, but yes, a problem is > tagging,+ > which ATM doesn't work. Nor do I say that is a ready solution now, just a > possibility > to avoid having to maintain low level code with at most 1 person (how it looks > ATM). > > And I don't propose to go that road now, but ATM I see nobody doing any other > experiments. > > Besides, tracker is constantly maintained and used since >> 5 years: > > https://github.com/GNOME/tracker/graphs/contributors > >> >>> >>> => That is good that we agree, but I find it very astonishing that we use >>> baloo >>> in its >>> current state more or less mandatory on all that systems were it by design >>> will >>> fail. 
>>> >>> (and it fails if you read the bugs) >>> >> >> There is a certain amount of failure, but it's not "by-design". But >> maybe I'm not seeing things clearly. > You yourself stated that neither 32-bit issues nor NFS nor > 32-bit inodes > have > any > error handling. And that seems to have been known even during design and still > we have this now as a framework per default used by any Plasma installation on > systems exactly featuring that without error checking. > >> >> >> How about requirements such as resource consumption, ease of >> integration, search speed are taken into consideration? Come on guys. >> We're engineers over here. >> > What is the argument here? If you take a look at bugs.kde.org, you see > that > people are complaining about all > of that with baloo. I see no evidence nowhere that e.g. baloo is > "superior" to > what GNOME uses > or any other solution (perhaps beside nepomuk, ok...). >> >> What tests have been to obtain the evidence? > What tests have been done to obtain the inverse evidence? I only hear here the > complaint > about not taking requirements like resource consumption or speed into account, > but > there is ATM zero evidence that e.g. tracker is slower. > > And yes, there are "it hogs" 100% memory or time bugs open, thought you can > hardly reproduce them > as people are somehow scared to pack their home and send it to us. Not that a > lot of that bugs > got touched at all in Bugzilla. > >> >>> Yup, you have. It's awesome. I no longer have the motivation to work on Baloo. >>> Thanks, but that makes me very sad, btw. >>> Baloo came up to replace nepomuk, which was dead because it had too many >>> bugs >>> and all maintainers left. >>> Now we have baloo, which has many bugs, some even by design, and the >>> maintainer >>> left, too. >>> >> >> Actually, Nepomuk was not dead. I was maintaining it. I killed it >> because it had too many structural problems. >> >> This is how the open source world works. 
People work on projects and >> when it no longer scratches their itch (I no longer use Baloo), they >> loose interest. This is "supposed" to be a hobby. > That is ok, to see it as hobby. > > But I am a bit unnerved that one proposes this as the generic index solution > for our desktop, which should be stable, if nothing else, and knows that it > has > severe > limitations that are not handled (see above). I would have assumed that at > least > the known "can't work here' > cases are handled in a graceful way. > > And given already one of the first things main.cpp of baloo_file does is: > >// HACK: Untill we start using lmdb with robust mutex support. We're just > going >to remove >// the lock manually in the baloo_file process. >QFile::remove(path + "/index-lock"); > > that doesn't leave high hopes, sorry. > > And the typical error check is: > > void MTimeDB::put(quint32 mtime, quint
Re: Scrap Baloo Thread Feedback
Hey On Fri, Oct 7, 2016 at 6:34 PM, Christoph Cullmann wrote: > Hi, > >> On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann >> wrote: >>> > > 1) No handling of DB errors beside asserting > 2) No handling of errors in the extractors (e.g. see the fixes I did, all > extractors will need more of that) > 3) No handling of NFS/large inodes/inconsistencies => crash > > In the end, in my opinion, you can rewrite close to all parts dealing with > the DB or > any other thing internally. If ever any thing gots inconsistent, ATM you are > doomed, forever, > if not by luck my new startup code deletes the index, then you live again > until it is reindexed. > >> > I am not sure, I am all for removing complete indexing and use a other indexer > like tracker to exactly avoid the excurse into DB world and how to handle it > in a safe way with close to zero person manpower. > It's avoiding the problem and hoping for the best, without any experiments. > > => That is good that we agree, but I find it very astonishing that we use > baloo in its > current state more or less mandatory on all that systems were it by design > will fail. > > (and it fails if you read the bugs) > There is a certain amount of failure, but it's not "by-design". But maybe I'm not seeing things clearly. >> How about requirements such as resource consumption, ease of integration, search speed are taken into consideration? Come on guys. We're engineers over here. >>> What is the argument here? If you take a look at bugs.kde.org, you see that >>> people are complaining about all >>> of that with baloo. I see no evidence nowhere that e.g. baloo is "superior" >>> to >>> what GNOME uses >>> or any other solution (perhaps beside nepomuk, ok...). What tests have been to obtain the evidence? > >> >> Yup, you have. It's awesome. I no longer have the motivation to work on >> Baloo. > Thanks, but that makes me very sad, btw. 
> Baloo came up to replace nepomuk, which was dead because it had too many bugs
> and all maintainers left.
> Now we have baloo, which has many bugs, some even by design, and the
> maintainer left, too.
>

Actually, Nepomuk was not dead. I was maintaining it. I killed it
because it had too many structural problems.

This is how the open source world works. People work on projects and
when it no longer scratches their itch (I no longer use Baloo), they
lose interest. This is "supposed" to be a hobby.

>
>> (This is why they run on a separate process)
> That doesn't help, it just OOMs your system => dead, it needs resource
> restrictions, which is tricky to get right.
>

You're right. It needs a better thought out solution. A separate
process is the bare minimum. Btw, have you looked if Tracker actually
does any of this?

>> My hostility was because the proposal ignores key points such as -
>>
>> * Indexing Speed
>> * Search speed
>> * Database size
> => If you look at the bugs, people complain we are inferior and I see not
> that the proposal ignores it, I just see not how to compare, given there are
> no hard facts that we are faster than e.g. tracker in any way.
>

Data can be gathered about it. Not all data is publicly available.

>> * Ease of use with our existing components
> My proposal did not change the interface at all, it has zero impact on "ease
> of use".
>
>> * Ease of fixing problems in the code
> My estimate would be: rewrite close to everything. Even the basic 64-bit int
> id won't work with 64-bit inodes, each DB call must be touched to check for
> errors, at each place one will need to check for potential inconsistencies
> and exit gracefully...
>

I don't follow why everything needs to be re-written? Am I missing
something or do we just need to check for more errors and use a higher
integer id? This certainly doesn't seem super trivial, but it sounds
like less work than implementing a shim on top of Tracker. I could be
wrong.
>>
>> Baloo has certain speed requirements if it is to be used with krunner,
>> and we want instant feedback. This was an integral requirement.
> I doubt e.g. tracker has different requirements, as it is used in similar
> places by GNOME.
>
> But all that left besides, have you an proposal how to fixup the current
> situation?
> Are you willing to invest some work to fix the current issues or an idea what
> would be a good way to tackle them?
>

I probably will not work more in Baloo. I'll have to investigate the
problems a bit more. From the cursory look of this thread, it doesn't
seem that the problems are that dire. But I may not be reading into it
correctly.

--
Vishesh Handa
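The "check for more errors" half of that suggestion can be illustrated for the id problem raised in the thread. This is a minimal sketch under an explicit assumption that is not taken from the Baloo sources: a document id packs a 32-bit device id and a 32-bit inode number into one 64-bit value. `makeDocId` is a hypothetical name. Instead of silently truncating an inode that needs more than 32 bits, the function reports that the file cannot be represented.

```cpp
#include <cstdint>

// Packs a device id and an inode number into one 64-bit id.
// Assumed layout (illustrative only): device id in the high 32 bits,
// inode in the low 32 bits. Returns false, leaving *out untouched,
// when either value does not fit into 32 bits.
inline bool makeDocId(std::uint64_t deviceId, std::uint64_t inode,
                      std::uint64_t* out)
{
    const std::uint64_t max32 = 0xFFFFFFFFULL;
    if (deviceId > max32 || inode > max32) {
        return false;   // caller should skip the file, not crash
    }
    *out = (deviceId << 32) | inode;
    return true;
}
```

A check like this only turns a later inconsistency into an early, handled failure. Actually supporting larger inodes, the "higher integer id" half of the suggestion, would mean changing the id format and everything stored with it.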
Re: Scrap Baloo Thread Feedback
Hi, > Hey > > On Fri, Oct 7, 2016 at 6:34 PM, Christoph Cullmann > wrote: >> Hi, >> >>> On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann >>> wrote: >> >> 1) No handling of DB errors beside asserting >> 2) No handling of errors in the extractors (e.g. see the fixes I did, all >> extractors will need more of that) >> 3) No handling of NFS/large inodes/inconsistencies => crash >> >> In the end, in my opinion, you can rewrite close to all parts dealing with >> the >> DB or >> any other thing internally. If ever any thing gots inconsistent, ATM you are >> doomed, forever, >> if not by luck my new startup code deletes the index, then you live again >> until >> it is reindexed. >> >>> >> I am not sure, I am all for removing complete indexing and use a other >> indexer >> like tracker to exactly avoid the excurse into DB world and how to handle it >> in a safe way with close to zero person manpower. >> > > It's avoiding the problem and hoping for the best, without any experiments. That is not true. I did experiments and search works with tracker, but yes, a problem is tagging,+ which ATM doesn't work. Nor do I say that is a ready solution now, just a possibility to avoid having to maintain low level code with at most 1 person (how it looks ATM). And I don't propose to go that road now, but ATM I see nobody doing any other experiments. Besides, tracker is constantly maintained and used since >> 5 years: https://github.com/GNOME/tracker/graphs/contributors > >> >> => That is good that we agree, but I find it very astonishing that we use >> baloo >> in its >> current state more or less mandatory on all that systems were it by design >> will >> fail. >> >> (and it fails if you read the bugs) >> > > There is a certain amount of failure, but it's not "by-design". But > maybe I'm not seeing things clearly. You yourself stated that neither 32-bit issues nor NFS nor > 32-bit inodes have any error handling. 
And that seems to have been known even during design and still we have this now as a framework per default used by any Plasma installation on systems exactly featuring that without error checking. > >>> > > How about requirements such as resource consumption, ease of > integration, search speed are taken into consideration? Come on guys. > We're engineers over here. > What is the argument here? If you take a look at bugs.kde.org, you see that people are complaining about all of that with baloo. I see no evidence nowhere that e.g. baloo is "superior" to what GNOME uses or any other solution (perhaps beside nepomuk, ok...). > > What tests have been to obtain the evidence? What tests have been done to obtain the inverse evidence? I only hear here the complaint about not taking requirements like resource consumption or speed into account, but there is ATM zero evidence that e.g. tracker is slower. And yes, there are "it hogs" 100% memory or time bugs open, thought you can hardly reproduce them as people are somehow scared to pack their home and send it to us. Not that a lot of that bugs got touched at all in Bugzilla. > >> >>> >>> Yup, you have. It's awesome. I no longer have the motivation to work on >>> Baloo. >> Thanks, but that makes me very sad, btw. >> Baloo came up to replace nepomuk, which was dead because it had too many bugs >> and all maintainers left. >> Now we have baloo, which has many bugs, some even by design, and the >> maintainer >> left, too. >> > > Actually, Nepomuk was not dead. I was maintaining it. I killed it > because it had too many structural problems. > > This is how the open source world works. People work on projects and > when it no longer scratches their itch (I no longer use Baloo), they > loose interest. This is "supposed" to be a hobby. That is ok, to see it as hobby. 
But I am a bit unnerved that one proposes this as the generic index solution for our desktop, which should be stable, if nothing else, and knows that it has severe limitations that are not handled (see above). I would have assumed that at least the known "can't work here" cases are handled in a graceful way. And given already one of the first things main.cpp of baloo_file does is:

    // HACK: Untill we start using lmdb with robust mutex support. We're just going to remove
    // the lock manually in the baloo_file process.
    QFile::remove(path + "/index-lock");

that doesn't leave high hopes, sorry. And the typical error check is:

    void MTimeDB::put(quint32 mtime, quint64 docId)
    {
        Q_ASSERT(mtime > 0);
        Q_ASSERT(docId > 0);

        MDB_val key;
        key.mv_size = sizeof(quint32);
        key.mv_data = static_cast<void*>(&mtime);

        MDB_val val;
        val.mv_size = sizeof(quint64);
        val.mv_data = static_cast<void*>(&docId);

        int rc = mdb_put(m_txn, m_dbi, &key, &val, 0);
        Q_ASSERT_X(rc == 0, "MTimeDB::put", mdb_strerror(rc));
    }

without any way to pass an error to the outside, nor any error handling code at the outside, as
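As a rough illustration of the alternative to asserting in put(), here is a minimal sketch in which the write reports failure to its caller instead of aborting. Everything in it is an assumption made for the example: PutResult, fakeStorePut and putMTime are hypothetical names, and fakeStorePut only imitates the "0 means success" return convention of mdb_put so that the sketch compiles without LMDB or Qt. It is not how Baloo is or was implemented.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical result type: carries an error message instead of aborting.
struct PutResult {
    bool ok;
    std::string error;
};

using FakeStore = std::multimap<std::uint32_t, std::uint64_t>;

// Stand-in for the storage backend (NOT the LMDB API). Returns 0 on success
// and non-zero on failure, imitating the return-code convention of mdb_put.
int fakeStorePut(FakeStore& store, std::uint32_t key, std::uint64_t value,
                 bool simulateFailure)
{
    if (simulateFailure) {
        return -1;
    }
    store.emplace(key, value);
    return 0;
}

// Validates the arguments and forwards backend errors to the caller,
// so the caller can decide to retry, skip the file or disable indexing.
PutResult putMTime(FakeStore& store, std::uint32_t mtime, std::uint64_t docId,
                   bool simulateFailure = false)
{
    if (mtime == 0 || docId == 0) {
        return {false, "invalid argument: mtime and docId must be non-zero"};
    }
    const int rc = fakeStorePut(store, mtime, docId, simulateFailure);
    if (rc != 0) {
        return {false, "backend error " + std::to_string(rc)};
    }
    return {true, std::string()};
}
```

A caller would check `result.ok` and, for example, log `result.error` and stop indexing, rather than have the whole process terminated by an assertion.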
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 6:20 PM, Aleix Pol wrote: >>> >> >> I don't understand why all framework discussions must happen on the >> same list. It just adds to a crazy amount of noise, which one then >> needs to parse through. > > Arguing that it should be elsewhere because you'd like to ignore the > rest of the traffic in kde-frameworks doesn't sound very constructive, > especially considering how they're the "noise" that actually improves > the frameworks. > > Maybe you can better configure your e-mail client differently so we > can focus on the issue at matter? This is not about how it should be. I'm informing them why it was chosen to be somewhere else. This decision can be changed. Frameworks collectively may or may not improve by having everything in one place. Let's not treat it as an axiom. An analogy could be that we get commit emails, but we get to choose which projects we are interested in. We don't make everyone subscribe to kde-commits, and then put their own complex filters on top. -- Vishesh Handa
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann wrote: >>> >> >> I don't understand why all framework discussions must happen on the >> same list. It just adds to a crazy amount of noise, which one then >> needs to parse through. > > If you would have baloo-devel I could understand that point, > but not with some other generic mailing list like kde-devel which > has the same amount of noise and is not even dedicated to 'frameworks' > or 'baloo'. If you guys plan to use frameworks-devel, then please change the review requests. It was just too much noise for me, and I found the noise/signal ratio way lower in kde-devel. Baloo-devel was specifically not chosen as it would just be another silo in KDE. Nepomuk used to suffer from that. -- Vishesh Handa
Re: Scrap Baloo Thread Feedback
Hi, > On Fri, Oct 7, 2016 at 5:58 PM, Christoph Cullmann > wrote: >> FYI, as my mail is in moderation queue on kde-devel >> >> - Weitergeleitete Mail - >> Von: "cullmann" >> An: "kde-frameworks-devel" >> CC: "kde-devel" >> Gesendet: Freitag, 7. Oktober 2016 17:56:35 >> Betreff: Re: Scrap Baloo Thread Feedback >> >> Hi, >> >>> Hey guys >>> >>> I was told there is a thread about scrapping Baloo. All Baloo >>> discussion used to happen on kde-devel and that's where the review >>> requests go. It's the only reason I am still subscribed to kde-devel. >> That is nice, but given baloo is a framework, that was unexpected, sorry. >> >>> >>> I must say, the thread is overall quite disappointing. There seems to >>> be no scientific or rationale cost based analysis of this. How about a >>> list of requirements and priorities are drawn up and then possible >>> solutions are evaluated according to it? >> >> Actually, the bugs.kde.org page tells you the facts: The bug number >> was constant increasing since > 1 year. The thread lists some other facts >> what is wrong ATM and should be fixed. >> > > It lists some of the facts. Not all. > > Of course the bug number is increasing. I am no longer pruning it. Are > the number of unique bugs increasing? What kind of users are being > affected by these bugs? Is it just people with specific kinds of > files? There is a lot of information that the bug tracker does not > cover. Lets also take the uncertainty into account, and then try to > mitigate it. In the end, most bugs boil down to: 1) No handling of DB errors beside asserting 2) No handling of errors in the extractors (e.g. see the fixes I did, all extractors will need more of that) 3) No handling of NFS/large inodes/inconsistencies => crash In the end, in my opinion, you can rewrite close to all parts dealing with the DB or any other thing internally. 
If anything ever gets inconsistent, ATM you are doomed forever, unless by luck my new startup code deletes the index; then you live again until it is reindexed. > >> And to replace baloo with something else based for example on tracker was >> just one proposal. >> >> Another was to fix baloo + port it to another database. > > Right, "another database". > > Typically one would expect the problems and features of our current > database to be evaluated against the others. This was an exercise that > I did and chose LMDB. > > What are the requirements for the database? I am not sure; I am all for removing the indexing completely and using another indexer like tracker, exactly to avoid the excursion into the DB world and the question of how to handle it in a safe way with close to zero manpower. And I oppose the idea of writing our own DB. > >> >>> >>> Right now, random requirements such as NFS and 32bit systems are >>> coming up. Are these really that important? I specifically designed >>> Baloo to not care about both network mounts and 32-bit systems. Yes, >>> Baloo has bugs and it won't handle more than 32bit-inodes. These >>> things, as all others, can be fixed. It's really a question of what is >>> important. Let's not target the outliers. Many of these decisions were >>> deliberately taken. >> Those are not random requirements, sorry, you could call them random >> restrictions, too. >> That is not that productive, is it? >> >> 1) 32-bit systems are still there and if that is a design decision to NOT >> support them, >> that is ok, but then bad for Plasma, no official support for 32-bit systems, >> baloo is IMHO >> the only framework with such requirements. And I don't see that we have told >> any distro >> not to compile it for 32-bit. >> > > * 32-bit systems can be supported. But it will be much inferior as the > database size will need to be limited. I never got around to doing > this. > * Plasma still supports 32-bit systems, but file indexing may be > limited.
It's the same way that compositing may be disabled if you > have old hardware. > * All that being a frameworks means is that there are ABI / API > guarantees and a release schedule. Not all frameworks target all > systems. > >> 2) No NFS: Ok, fair game, but then, it should check that and disable itself >> completely if $HOME >> where the db is stored is a NFS, can live with that, too, but not with the >> current "we random >> crash" behavior. => That is a user experience we don't want
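The "check for NFS and disable itself" behavior asked for in point 2) could, as a rough Linux-only sketch, look like the following. It relies on statfs(2) and the NFS filesystem magic number 0x6969 (NFS_SUPER_MAGIC in linux/magic.h). The names FsKind, classifyPath and shouldEnableIndexing are hypothetical, the policy of refusing to index when the check fails is an assumption, and other network filesystems are not covered. This is an illustration, not Baloo code.

```cpp
#include <sys/vfs.h>  // statfs(2), Linux-specific
#include <string>

// Value reported in f_type for NFS mounts on Linux
// (NFS_SUPER_MAGIC in <linux/magic.h>).
const long kNfsSuperMagic = 0x6969;

// Hypothetical classification of the filesystem holding a path.
enum class FsKind { NotNfs, Nfs, Unknown };

// Asks the kernel which filesystem the path lives on.
// Returns Unknown if statfs fails (e.g. the path does not exist).
FsKind classifyPath(const std::string& path)
{
    struct statfs info;
    if (statfs(path.c_str(), &info) != 0) {
        return FsKind::Unknown;
    }
    return static_cast<long>(info.f_type) == kNfsSuperMagic ? FsKind::Nfs
                                                            : FsKind::NotNfs;
}

// Assumed policy for this sketch: only index when the database directory
// is known not to be on NFS; otherwise stay disabled instead of crashing.
bool shouldEnableIndexing(const std::string& dbDir)
{
    return classifyPath(dbDir) == FsKind::NotNfs;
}
```

With such a check done once at startup, the indexer could switch itself off and tell the user why, which is the graceful handling requested above.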
Re: Scrap Baloo Thread Feedback
On Friday, 7 October 2016 18:27:30 CEST Vishesh Handa wrote: > On Fri, Oct 7, 2016 at 6:20 PM, Aleix Pol wrote: > >> I don't understand why all framework discussions must happen on the > >> same list. It just adds to a crazy amount of noise, which one then > >> needs to parse through. > > > > Arguing that it should be elsewhere because you'd like to ignore the > > rest of the traffic in kde-frameworks doesn't sound very constructive, > > especially considering how they're the "noise" that actually improves > > the frameworks. > > > > Maybe you can better configure your e-mail client differently so we > > can focus on the issue at matter? > > This is not about how it should be. I'm informing them why it was > chosen to be somewhere else. This decision can be changed. > > Frameworks collectively may or may not improve by having everything in > one place. Lets not treat it as a axiom. > > An analogy could be that we get commit emails, but we get to choose > which projects we are interested in. We don't make everyone subscribe > to kde-commits, and then put their own complex filters on top. We are moving out from the main point of the discussion, but I'd like to point out that the support for topics in mailman[1] covers this use case. Not sure how to make reviewboard or phabricator add a topic to the notification email, though. (Also, we did not have a final decision whether we should go back to kde-core- devel for Frameworks-related topic). [1] http://www.list.org/mailman-member/node29.html Ciao -- Luigi
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 6:01 PM, Vishesh Handa wrote: > On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk wrote: >> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote: >>> Hey guys >>> >>> I was told there is a thread about scrapping Baloo. All Baloo >>> discussion used to happen on kde-devel and that's where the review >>> requests go. It's the only reason I am still subscribed to kde-devel. >> >> Heya, >> >> Baloo is a framework nowadays, therefore it totally makes sense to have the >> discussion on kde-framework-devel. >> >> There's been tons of discussion around Baloo on kde-framework-devel already. >> kde-frameworks-devel is also where the CI messages for the baloo repo go to. >> >> It likely makes sense for you to subscribe, no? >> > > I don't understand why all framework discussions must happen on the > same list. It just adds to a crazy amount of noise, which one then > needs to parse through. Arguing that it should be elsewhere because you'd like to ignore the rest of the traffic in kde-frameworks doesn't sound very constructive, especially considering how they're the "noise" that actually improves the frameworks. Maybe you can better configure your e-mail client differently so we can focus on the issue at matter? Aleix
Re: Scrap Baloo Thread Feedback
Hi, > On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann > wrote: >>> >>> I don't understand why all framework discussions must happen on the >>> same list. It just adds to a crazy amount of noise, which one then >>> needs to parse through. >> >> If you would have baloo-devel I could understand that point, >> but not with some other generic mailing list like kde-devel which >> has the same amount of noise and is not even dedicated to 'frameworks' >> or 'baloo'. > > If you guys plan to use frameworks-devel, then please change the > review requests. > > It was just too much noise for me, and I found the noise/signal ratio > way lower in kde-devel. Baloo-devel was specifically not chosen as it > would just be another silo in KDE. Nepomuk used to suffer from that. I use the power of e-mail filters to sort the review requests into a subfolder; I think others might do the same (same for CI). I don't see a point in changing that policy if all others can live with it. Greetings Christoph -- - Dr.-Ing. Christoph Cullmann - AbsInt Angewandte Informatik GmbH, Science Park 1, 66123 Saarbrücken, GERMANY. Email: cullm...@absint.com Tel: +49-681-38360-22 Fax: +49-681-38360-20 WWW: http://www.AbsInt.com Geschäftsführung: Dr.-Ing. Christian Ferdinand Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234
Re: Scrap Baloo Thread Feedback
Hi, > On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk wrote: >> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote: >>> Hey guys >>> >>> I was told there is a thread about scrapping Baloo. All Baloo >>> discussion used to happen on kde-devel and that's where the review >>> requests go. It's the only reason I am still subscribed to kde-devel. >> >> Heya, >> >> Baloo is a framework nowadays, therefore it totally makes sense to have the >> discussion on kde-frameworks-devel. >> >> There's been tons of discussion around Baloo on kde-frameworks-devel already. >> kde-frameworks-devel is also where the CI messages for the baloo repo go to. >> >> It likely makes sense for you to subscribe, no? >> > > I don't understand why all framework discussions must happen on the > same list. It just adds to a crazy amount of noise, which one then > needs to parse through. If you would have baloo-devel I could understand that point, but not with some other generic mailing list like kde-devel which has the same amount of noise and is not even dedicated to 'frameworks' or 'baloo'. Greetings Christoph -- - Dr.-Ing. Christoph Cullmann - AbsInt Angewandte Informatik GmbH, Science Park 1, 66123 Saarbrücken, GERMANY. Email: cullm...@absint.com Tel: +49-681-38360-22 Fax: +49-681-38360-20 WWW: http://www.AbsInt.com Geschäftsführung: Dr.-Ing. Christian Ferdinand Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk wrote: > On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote: >> Hey guys >> >> I was told there is a thread about scrapping Baloo. All Baloo >> discussion used to happen on kde-devel and that's where the review >> requests go. It's the only reason I am still subscribed to kde-devel. > > Heya, > > Baloo is a framework nowadays, therefore it totally makes sense to have the > discussion on kde-framework-devel. > > There's been tons of discussion around Baloo on kde-framework-devel already. > kde-frameworks-devel is also where the CI messages for the baloo repo go to. > > It likely makes sense for you to subscribe, no? > I don't understand why all framework discussions must happen on the same list. It just adds to a crazy amount of noise, which one then needs to parse through. > Cheers, > Kevin > >> (snip) > > > -- > Kevin Funk | kf...@kde.org | http://kfunk.org
Re: Scrap Baloo Thread Feedback
On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote: > Hey guys > > I was told there is a thread about scrapping Baloo. All Baloo > discussion used to happen on kde-devel and that's where the review > requests go. It's the only reason I am still subscribed to kde-devel. Heya, Baloo is a framework nowadays, therefore it totally makes sense to have the discussion on kde-frameworks-devel. There's been tons of discussion around Baloo on kde-frameworks-devel already. kde-frameworks-devel is also where the CI messages for the baloo repo go to. It likely makes sense for you to subscribe, no? Cheers, Kevin > (snip) -- Kevin Funk | kf...@kde.org | http://kfunk.org
Re: Scrap Baloo Thread Feedback
Hi, > Hey guys > > I was told there is a thread about scrapping Baloo. All Baloo > discussion used to happen on kde-devel and that's where the review > requests go. It's the only reason I am still subscribed to kde-devel. That is nice, but given baloo is a framework, that was unexpected, sorry. > > I must say, the thread is overall quite disappointing. There seems to > be no scientific or rational cost-based analysis of this. How about a > list of requirements and priorities are drawn up and then possible > solutions are evaluated according to it? Actually, the bugs.kde.org page tells you the facts: the number of bugs has been constantly increasing for more than a year. The thread lists some other facts about what is wrong ATM and should be fixed. And replacing baloo with something else, based for example on tracker, was just one proposal. Another was to fix baloo + port it to another database. > > Right now, random requirements such as NFS and 32bit systems are > coming up. Are these really that important? I specifically designed > Baloo to not care about both network mounts and 32-bit systems. Yes, > Baloo has bugs and it won't handle more than 32bit-inodes. These > things, as all others, can be fixed. It's really a question of what is > important. Let's not target the outliers. Many of these decisions were > deliberately taken. Those are not random requirements, sorry; you could just as well call them random restrictions. That is not that productive, is it? 1) 32-bit systems are still there, and if it is a design decision to NOT support them, that is ok, but then bad for Plasma: no official support for 32-bit systems, and baloo is IMHO the only framework with such requirements. And I don't see that we have told any distro not to compile it for 32-bit. 2) No NFS: Ok, fair game, but then it should check that and disable itself completely if $HOME, where the db is stored, is on NFS. I can live with that, too, but not with the current "we randomly crash" behavior.
=> That is a user experience we don't want, or? 3) > 32-bit inodes: That is normal and should work, but even if it should not: ATM you get inconsistencies and then later assertion failures or crashes. => I can live with all restrictions, but not with the current handling of them, which always ends in "crash"; that is IMHO not acceptable. But that is "my" opinion, which might vary in the eyes of others. > > How about requirements such as resource consumption, ease of > integration, search speed are taken into consideration? Come on guys. > We're engineers over here. What is the argument here? If you take a look at bugs.kde.org, you see that people are complaining about all of that with baloo. I see no evidence anywhere that e.g. baloo is "superior" to what GNOME uses or any other solution (perhaps beside nepomuk, ok...). I fixed more bugs in a few days than were fixed in 1 year and triaged more than ever; still, a lot remains to be done. (And I really did not do a lot, just removed things like 'self destruct if index > 5GB' or 'crash forever on db corruption'.) A graph tells more than words: https://bugs.kde.org/reports.cgi?product=frameworks-baloo&output=show_chart&datasets=CONFIRMED&datasets=ASSIGNED&datasets=REOPENED&datasets=UNCONFIRMED&datasets=RESOLVED&banner=1 Given the current open bugs, one will need to: 1) review all extractors, they still have close to zero error handling and will just crash or OOM you on bad files 2) review + fix the complete database handling to handle errors and perhaps swap the DB 3) fix the indexer to have some resource limits to avoid OOM and Co. if e.g. extractors fail ... Hence my proposal, given we lack manpower, to implement the baloo API on top of e.g. tracker to avoid all this and let tracker handle that. To check if that is at all feasible, I did a quick and dirty implementation (still modulo filling of the metadata in the results + tagging, which is a problem, but that was only to see if e.g.
search works) https://quickgit.kde.org/?p=clones%2Fbaloo%2Fcullmann%2Ftbaloo.git That is just a proposal and then I started the discussion. Until now, we have one other proposal, by Boudhayan, to fixup baloo. > > (If the discussion continues on kde-frameworks-devel, I probably won't see it) I won't see it on kde-devel, please, frameworks related stuff should really be discussed on the frameworks list. Greetings Christoph -- - Dr.-Ing. Christoph Cullmann - AbsInt Angewandte Informatik GmbH, Science Park 1, 66123 Saarbrücken, GERMANY. Email: cullm...@absint.com Tel: +49-681-38360-22 Fax: +49-681-38360-20 WWW: http://www.AbsInt.com Geschäftsführung: Dr.-Ing. Christian Ferdinand Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234