Re: Thread related crash on failed dive computer download
On Dienstag, 31. Oktober 2017 21:00:37 CET Berthold Stoeger wrote: > So in the attachment is a different approach: don't output errors if not in > the main thread. Someone who generates a thread is responsible of flushing > the errors later. Sounds much less brittle to me. It works in my limited test-case and I generated a pull request: https://github.com/Subsurface-divelog/subsurface/pull/753 Fixing the race condition should not be too hard. Berthold ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
Re: Thread related crash on failed dive computer download
Hi Dirk, On Dienstag, 31. Oktober 2017 20:41:32 CET Dirk Hohndel wrote: > On Tue, Oct 31, 2017 at 08:10:32PM +0100, Berthold Stoeger wrote: > > To be honest, apart from the errors mentioned above, I'm not even sure > > that > > the idea behind the code is sound. If you want, I can send an > > updated/cleaned up patch, but I have a bad feeling with this one. There > > probably should at least be a non-varargs version. > > The idea that the thread collects its messages and that the caller of the > thread then initiates their display when it is done seems sound to me. > > I'd really like a version that's good enough to use in 4.7.2 - and I'm > open to further refinement once that's released. I still think the approach is fundamentally broken/brittle. You have to identify which calls are made from the GUI thread and which are not. Who knows if I got the right report_error()s? What about errors that happen only in few cases? So in the attachment is a different approach: don't output errors if not in the main thread. Someone who generates a thread is responsible of flushing the errors later. Sounds much less brittle to me. This means that the membuffer in errorhelper.c has to be protected by a mutex or something (not an expert in threads). But that race was there before, wasn't it? What do you think? Bertholddiff --git a/desktop-widgets/downloadfromdivecomputer.cpp b/desktop-widgets/downloadfromdivecomputer.cpp index 637f6034..b875b4a7 100644 --- a/desktop-widgets/downloadfromdivecomputer.cpp +++ b/desktop-widgets/downloadfromdivecomputer.cpp @@ -220,6 +220,10 @@ void DownloadFromDCWidget::updateState(states state) // got an error else if (state == ERROR) { timer->stop(); + + // Show messages that worker thread produced. + MainWindow::instance()->showErrors(); + QMessageBox::critical(this, TITLE_OR_TEXT(tr("Error"), thread.error), QMessageBox::Ok); markChildrenAsEnabled(); progress_bar_text = ""; diff --git a/desktop-widgets/mainwindow.cpp b/desktop-widgets/mainwindow.cpp index 95495b66..2119e375 100644 --- a/desktop-widgets/mainwindow.cpp +++ b/desktop-widgets/mainwindow.cpp @@ -88,12 +88,25 @@ MainWindow *MainWindow::m_Instance = NULL; extern "C" void showErrorFromC() { + // Show errors only if we are running in the gui thread. + // If we're in the gui thread, let errors accumulate. + if (QThread::currentThread() != QCoreApplication::instance()->thread()) { + return; + } + MainWindow *mainwindow = MainWindow::instance(); if (mainwindow) { mainwindow->getNotificationWidget()->showNotification(get_error_string(), KMessageWidget::Error); } } +void MainWindow::showErrors() +{ + const char *error = get_error_string(); + if (error && error[0]) { + getNotificationWidget()->showNotification(error, KMessageWidget::Error); + } +} MainWindow::MainWindow() : QMainWindow(), actionNextDive(0), diff --git a/desktop-widgets/mainwindow.h b/desktop-widgets/mainwindow.h index edd3fb0c..50590aa2 100644 --- a/desktop-widgets/mainwindow.h +++ b/desktop-widgets/mainwindow.h @@ -89,6 +89,10 @@ public: void enableDisableCloudActions(); void setCheckedActionFilterTags(bool checked); + // Shows errors that have accumulated. + // Must be called from GUI thread. + void showErrors(); + private slots: /* file menu action */ ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
Re: Thread related crash on failed dive computer download
On Tue, Oct 31, 2017 at 08:10:32PM +0100, Berthold Stoeger wrote: > > On Tue, Oct 31, 2017 at 03:52:28PM +0100, Berthold Stoeger wrote: > > > As a FYI, the patch attached, though without commit-message because it is > > > simply too ugly. It works for my particular test case, but... > > > > Why is it too ugly? > > Duplication of the va_args thing. At a place where it doesn't belong. Shrug. > Movement of C-style strings to QString back to C-style strings back to > QStrings. Oh my. Yes, that always bothers me when we transition through C and C++ code and back. We should try to keep them as just C strings for as long as possible. > No encapsulation whatsoever. C function defined in C++ header and declared > extern in C source. > Etc. :( Shrug. I'm trying to get a last minute fix for a bug that I introduced... > > > + errorList.setLocalData(QStringList()); > > > > So we initialize to an empty list. I'm unclear if there could be messages > > still in that local storage that haven't been delivered, yet. > > You're right, the local storage should be gone once the thread exits, even > though the thread object is reused. > > What should be cleared though is the "errors" member, because the object is > reused. > > BTW: Another bug: the code doesn't clear the membuffer. Oops. > Forget the std::move - it should be removed. This would be used in standard C+ > + where list assignment makes a deep copy. A std::move on the other hand just > replaces pointers. But as I just learned, Qt containers do copy on write, so > this is not necessary. I have no idea how they would react to a std::move. OK > > > +// Hack! for report_error_thread() > > > +#include "core/downloadfromdcthread.h" > > > > Why is that a hack? > > Because downloadfromdcthread.h should not be needed here? This should go into a helper .h file. > > > @@ -220,6 +220,9 @@ void DownloadFromDCWidget::updateState(states state) > > > > > > // got an error > > > else if (state == ERROR) { > > > > > > timer->stop(); > > > > > > + foreach(const QString , thread.errors) { > > > + report_message(s.toUtf8().data()); > > > + } > > > > This one I don't understand... why aren't these reported when we do the > > std::move() above? > > On completion, the thread moved the messages collected in the thread local > area to the "errors" member of the thread object. And here the main thread > displays the collected messages. > > To be honest, apart from the errors mentioned above, I'm not even sure that > the idea behind the code is sound. If you want, I can send an updated/cleaned > up patch, but I have a bad feeling with this one. There probably should at > least be a non-varargs version. The idea that the thread collects its messages and that the caller of the thread then initiates their display when it is done seems sound to me. I'd really like a version that's good enough to use in 4.7.2 - and I'm open to further refinement once that's released. /D ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
Re: Thread related crash on failed dive computer download
On Dienstag, 31. Oktober 2017 19:25:03 CET Dirk Hohndel wrote: > On Tue, Oct 31, 2017 at 03:52:28PM +0100, Berthold Stoeger wrote: > > As a FYI, the patch attached, though without commit-message because it is > > simply too ugly. It works for my particular test case, but... > > Why is it too ugly? Duplication of the va_args thing. At a place where it doesn't belong. Movement of C-style strings to QString back to C-style strings back to QStrings. Oh my. No encapsulation whatsoever. C function defined in C++ header and declared extern in C source. Etc. :( > > + errorList.setLocalData(QStringList()); > > So we initialize to an empty list. I'm unclear if there could be messages > still in that local storage that haven't been delivered, yet. You're right, the local storage should be gone once the thread exits, even though the thread object is reused. What should be cleared though is the "errors" member, because the object is reused. BTW: Another bug: the code doesn't clear the membuffer. > > if (errorText) > > > > error = str_error(errorText, internalData->devname, > > internalData->vendor, internalData->product);> > > + errors = std::move(errorList.localData()); > > + > > Because this seems like it would ('move'?) clear out the local storage, > correct? Forget the std::move - it should be removed. This would be used in standard C+ + where list assignment makes a deep copy. A std::move on the other hand just replaces pointers. But as I just learned, Qt containers do copy on write, so this is not necessary. I have no idea how they would react to a std::move. > > + > > +// Dirty hack: report_error(...) crashed when not called from main > > thread. > > +// Thus, replace report_error(...) calls by report_error_thread(...) > > +// calls, which collect error messages in thread local storage. > > +extern "C" int report_error_thread(const char *fmt, ...); > > + > > Again, why is this a dirty hack? See above. > > +// Hack! for report_error_thread() > > +#include "core/downloadfromdcthread.h" > > Why is that a hack? Because downloadfromdcthread.h should not be needed here? > > @@ -220,6 +220,9 @@ void DownloadFromDCWidget::updateState(states state) > > > > // got an error > > else if (state == ERROR) { > > > > timer->stop(); > > > > + foreach(const QString , thread.errors) { > > + report_message(s.toUtf8().data()); > > + } > > This one I don't understand... why aren't these reported when we do the > std::move() above? On completion, the thread moved the messages collected in the thread local area to the "errors" member of the thread object. And here the main thread displays the collected messages. To be honest, apart from the errors mentioned above, I'm not even sure that the idea behind the code is sound. If you want, I can send an updated/cleaned up patch, but I have a bad feeling with this one. There probably should at least be a non-varargs version. Berthold ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
Re: Thread related crash on failed dive computer download
On Tue, Oct 31, 2017 at 03:52:28PM +0100, Berthold Stoeger wrote: > Hi Dirk, > > On Dienstag, 31. Oktober 2017 13:52:45 CET you wrote: > > > On Oct 31, 2017, at 4:43 AM, Jan Mulderwrote: > > > > > > On 31-10-17 12:05, Jan Mulder wrote: > > > Sounds like a fundamental problem, with the new way of error reporting ... > > > but not sure ... > > Yes, that makes sense. Doing the showNotification there might fail. Instead > > we should just let the UI thread know that there is a new message to > > display. This may prevent 4.7.2 today as I won't have time to look at this > > until late this evening and then we might want to test it :-) > > > > Unless one of you can look at it while I'm on an airplane for the next two > > hours > > After a quick look it seems that report_error(...) is used quite liberally in > core/qt-ble.cpp, core/downloadfromthread.cpp and core/libdivecomputer.c, > which are all executed in DownloadThread context. Detangling this seems hard > as part of it is in C, other parts are callbacks from libdivecomputer, etc. Yes, the idea was to simply just stage errors (and warnings, frankly) and have the UI deal with them. Which we did a terrible job of in the past. My solution made this much more responsive - and broken because of the thread issue that I missed. > As a quick stop-gap measure I replaced the report_error(...) calls by > report_error_thread(...) calls, which collect the error messages in a thread > local storage. The collected error messages are then passed to the main > thread > via a member. I only had a minute to glance through the code, but it seems reasonable. > As a FYI, the patch attached, though without commit-message because it is > simply too ugly. It works for my particular test case, but... Why is it too ugly? /D > void DownloadThread::run() > { > + errorList.setLocalData(QStringList()); So we initialize to an empty list. I'm unclear if there could be messages still in that local storage that haven't been delivered, yet. > if (errorText) > error = str_error(errorText, internalData->devname, > internalData->vendor, internalData->product); > > + errors = std::move(errorList.localData()); > + Because this seems like it would ('move'?) clear out the local storage, correct? > + > +// Dirty hack: report_error(...) crashed when not called from main thread. > +// Thus, replace report_error(...) calls by report_error_thread(...) > +// calls, which collect error messages in thread local storage. > +extern "C" int report_error_thread(const char *fmt, ...); > + Again, why is this a dirty hack? > diff --git a/core/qt-ble.cpp b/core/qt-ble.cpp > index a72736d4..277e44c6 100644 > --- a/core/qt-ble.cpp > +++ b/core/qt-ble.cpp > @@ -18,6 +18,9 @@ > #include "core/qt-ble.h" > #include "core/btdiscovery.h" > > +// Hack! for report_error_thread() > +#include "core/downloadfromdcthread.h" Why is that a hack? > @@ -220,6 +220,9 @@ void DownloadFromDCWidget::updateState(states state) > // got an error > else if (state == ERROR) { > timer->stop(); > + foreach(const QString , thread.errors) { > + report_message(s.toUtf8().data()); > + } This one I don't understand... why aren't these reported when we do the std::move() above? /D ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
Re: Thread related crash on failed dive computer download
Hi Dirk, On Dienstag, 31. Oktober 2017 13:52:45 CET you wrote: > > On Oct 31, 2017, at 4:43 AM, Jan Mulderwrote: > > > > On 31-10-17 12:05, Jan Mulder wrote: > > Sounds like a fundamental problem, with the new way of error reporting ... > > but not sure ... > Yes, that makes sense. Doing the showNotification there might fail. Instead > we should just let the UI thread know that there is a new message to > display. This may prevent 4.7.2 today as I won't have time to look at this > until late this evening and then we might want to test it :-) > > Unless one of you can look at it while I'm on an airplane for the next two > hours After a quick look it seems that report_error(...) is used quite liberally in core/qt-ble.cpp, core/downloadfromthread.cpp and core/libdivecomputer.c, which are all executed in DownloadThread context. Detangling this seems hard as part of it is in C, other parts are callbacks from libdivecomputer, etc. As a quick stop-gap measure I replaced the report_error(...) calls by report_error_thread(...) calls, which collect the error messages in a thread local storage. The collected error messages are then passed to the main thread via a member. As a FYI, the patch attached, though without commit-message because it is simply too ugly. It works for my particular test case, but... Berthold diff --git a/core/downloadfromdcthread.cpp b/core/downloadfromdcthread.cpp index e01de288..3ce4d7ab 100644 --- a/core/downloadfromdcthread.cpp +++ b/core/downloadfromdcthread.cpp @@ -1,8 +1,10 @@ #include "downloadfromdcthread.h" #include "core/libdivecomputer.h" +#include "core/membuffer.h" #include "core/subsurface-qt/SettingsObjectWrapper.h" #include #include +#include QStringList vendorList; QHash productList; @@ -10,6 +12,24 @@ static QHash mobileProductList; // BT, BLE or FTDI support QMap descriptorLookup; ConnectionListModel connectionListModel; +QThreadStorage errorList; + +#define VA_BUF(b, fmt) do { va_list args; va_start(args, fmt); put_vformat(b, fmt, args); va_end(args); } while (0) + +extern "C" +int report_error_thread(const char *fmt, ...) +{ + struct membuffer buf = { 0 }; + + /* Previous unprinted errors? Add a newline in between */ + VA_BUF(, fmt); + mb_cstring(); + + if(buf.len) + errorList.localData().append(QString(buf.buffer)); + return -1; +} + static QString str_error(const char *fmt, ...) { va_list args; @@ -27,6 +47,7 @@ DownloadThread::DownloadThread() void DownloadThread::run() { + errorList.setLocalData(QStringList()); auto internalData = m_data->internalData(); internalData->descriptor = descriptorLookup[m_data->vendor() + m_data->product()]; internalData->download_table = @@ -51,6 +72,8 @@ void DownloadThread::run() if (errorText) error = str_error(errorText, internalData->devname, internalData->vendor, internalData->product); + errors = std::move(errorList.localData()); + qDebug() << "Finishing the thread" << errorText << "dives downloaded" << downloadTable.nr; auto dcs = SettingsObjectWrapper::instance()->dive_computer_settings; dcs->setVendor(internalData->vendor); diff --git a/core/downloadfromdcthread.h b/core/downloadfromdcthread.h index ff4b8f39..2dd134ff 100644 --- a/core/downloadfromdcthread.h +++ b/core/downloadfromdcthread.h @@ -79,6 +79,7 @@ public: Q_INVOKABLE DCDeviceData *data(); QString error; + QStringList errors; private: DCDeviceData *m_data; @@ -101,4 +102,10 @@ extern QStringList vendorList; extern QHash productList; extern QMap descriptorLookup; extern ConnectionListModel connectionListModel; + +// Dirty hack: report_error(...) crashed when not called from main thread. +// Thus, replace report_error(...) calls by report_error_thread(...) +// calls, which collect error messages in thread local storage. +extern "C" int report_error_thread(const char *fmt, ...); + #endif diff --git a/core/libdivecomputer.c b/core/libdivecomputer.c index d8fe54e8..86456751 100644 --- a/core/libdivecomputer.c +++ b/core/libdivecomputer.c @@ -18,6 +18,8 @@ #include "libdivecomputer.h" #include "core/version.h" +int report_error_thread(const char *fmt, ...); + #if !defined(SSRF_LIBDC_VERSION) || SSRF_LIBDC_VERSION < 2 #pragma message "Subsurface requires a reasonably current version of the Subsurface-branch" #pragma message "of libdivecomputer (at least version 2 of our API)." @@ -106,10 +108,10 @@ static int parse_gasmixes(device_data_t *devdata, struct dive *dive, dc_parser_t if (rc == DC_STATUS_SUCCESS) { if (ntanks && ntanks < ngases) { shown_warning = true; - report_error("Warning: different number of gases (%d) and cylinders (%d)", ngases, ntanks); + report_error_thread("Warning: different number of gases (%d) and cylinders (%d)", ngases, ntanks); } else if (ntanks > ngases) { shown_warning = true; - report_error("Warning:
Re: Thread related crash on failed dive computer download
> On Oct 31, 2017, at 4:43 AM, Jan Mulderwrote: > > On 31-10-17 12:05, Jan Mulder wrote: > >> But Dirk has to worry, as it suspiciously close to the new error handling >> code that has just landed in master. > > for Dirk: > > from Qt 5.9.2 from build from source with debug: > > failed to connect to the controller 00:80:25:4A:0F:C2 with error "" > QObject: Cannot create children for a parent that is in a different thread. > (Parent is QLabel(0x562909c0), parent's thread is > QThread(0x55ea2cc0), current thread is DownloadThread(0x7fffcbc0) > ASSERT failure in QCoreApplication::sendEvent: "Cannot send events to objects > owned by a different thread. Current thread 0x0x7fffcbc0. Receiver '' (of > type 'QToolButton') was created in thread 0x0x55ea2cc0", file > kernel/qcoreapplication.cpp, line 563 > > Sounds like a fundamental problem, with the new way of error reporting ... > but not sure ... Yes, that makes sense. Doing the showNotification there might fail. Instead we should just let the UI thread know that there is a new message to display. This may prevent 4.7.2 today as I won't have time to look at this until late this evening and then we might want to test it :-) Unless one of you can look at it while I'm on an airplane for the next two hours /D ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
Re: Thread related crash on failed dive computer download
On 31-10-17 12:05, Jan Mulder wrote: But Dirk has to worry, as it suspiciously close to the new error handling code that has just landed in master. for Dirk: from Qt 5.9.2 from build from source with debug: failed to connect to the controller 00:80:25:4A:0F:C2 with error "" QObject: Cannot create children for a parent that is in a different thread. (Parent is QLabel(0x562909c0), parent's thread is QThread(0x55ea2cc0), current thread is DownloadThread(0x7fffcbc0) ASSERT failure in QCoreApplication::sendEvent: "Cannot send events to objects owned by a different thread. Current thread 0x0x7fffcbc0. Receiver '' (of type 'QToolButton') was created in thread 0x0x55ea2cc0", file kernel/qcoreapplication.cpp, line 563 Sounds like a fundamental problem, with the new way of error reporting ... but not sure ... As compiled with debug it trips on the assert here. --jan ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
Re: Thread related crash on failed dive computer download
Berthold, On 31-10-17 11:53, Berthold Stoeger wrote: Dear all, I tried to reset my git repository to head and now I'm getting a crash which I didn't get before. I tried "git clean -fx", but that didn't help. The crash happens when using invalid Bluetooth addresses. Call trace: - core/qt-ble.c l.326: report_error(...) - core/errorhelper.c l.43: error_cb() - desktop-widgets/mainwindow.cpp l.93: showNotification(...) This call doesn't return but produces the following output: The crash probably has nothing to do with the membuffer code (which itself looks scarily thread-unsafe): Replacing l. 93 of desktop-widgets/mainwindow.cpp by mainwindow->getNotificationWidget()->showNotification("test",...); still produces the crash. But mainwindow->getNotificationWidget()->showNotification("",...); does not crash. I have a feeling that I'm doing something obviously wrong, but I don't see it. Thank you, Don't worry. It is not your build environment as I can easily reproduce your crash. BLE with incorrect MAC address. But Dirk has to worry, as it suspiciously close to the new error handling code that has just landed in master. --jan ___ subsurface mailing list subsurface@subsurface-divelog.org http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface