Re: Thread related crash on failed dive computer download

2017-10-31 Thread Berthold Stoeger
On Dienstag, 31. Oktober 2017 21:00:37 CET Berthold Stoeger wrote:
> So in the attachment is a different approach: don't output errors if not in
> the main thread. Someone who generates a thread is responsible of flushing
> the errors later. Sounds much less brittle to me.

It works in my limited test-case and I generated a pull request:
https://github.com/Subsurface-divelog/subsurface/pull/753

Fixing the race condition should not be too hard.

Berthold

___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface


Re: Thread related crash on failed dive computer download

2017-10-31 Thread Berthold Stoeger
Hi Dirk,

On Dienstag, 31. Oktober 2017 20:41:32 CET Dirk Hohndel wrote:
> On Tue, Oct 31, 2017 at 08:10:32PM +0100, Berthold Stoeger wrote:
> > To be honest, apart from the errors mentioned above, I'm not even sure
> > that
> > the idea behind the code is sound. If you want, I can send an
> > updated/cleaned up patch, but I have a bad feeling with this one. There
> > probably should at least be a non-varargs version.
> 
> The idea that the thread collects its messages and that the caller of the
> thread then initiates their display when it is done seems sound to me.
> 
> I'd really like a version that's good enough to use in 4.7.2 - and I'm
> open to further refinement once that's released.

I still think the approach is fundamentally broken/brittle. You have to 
identify which calls are made from the GUI thread and which are not. Who knows 
if I got the right report_error()s? What about errors that happen only in few 
cases?

So in the attachment is a different approach: don't output errors if not in 
the main thread. Someone who generates a thread is responsible of flushing the 
errors later. Sounds much less brittle to me.

This means that the membuffer in errorhelper.c has to be protected by a mutex 
or something (not an expert in threads). But that race was there before, 
wasn't it?

What do you think?

Bertholddiff --git a/desktop-widgets/downloadfromdivecomputer.cpp b/desktop-widgets/downloadfromdivecomputer.cpp
index 637f6034..b875b4a7 100644
--- a/desktop-widgets/downloadfromdivecomputer.cpp
+++ b/desktop-widgets/downloadfromdivecomputer.cpp
@@ -220,6 +220,10 @@ void DownloadFromDCWidget::updateState(states state)
 	// got an error
 	else if (state == ERROR) {
 		timer->stop();
+
+		// Show messages that worker thread produced.
+		MainWindow::instance()->showErrors();
+
 		QMessageBox::critical(this, TITLE_OR_TEXT(tr("Error"), thread.error), QMessageBox::Ok);
 		markChildrenAsEnabled();
 		progress_bar_text = "";
diff --git a/desktop-widgets/mainwindow.cpp b/desktop-widgets/mainwindow.cpp
index 95495b66..2119e375 100644
--- a/desktop-widgets/mainwindow.cpp
+++ b/desktop-widgets/mainwindow.cpp
@@ -88,12 +88,25 @@ MainWindow *MainWindow::m_Instance = NULL;
 
 extern "C" void showErrorFromC()
 {
+	// Show errors only if we are running in the gui thread.
+	// If we're in the gui thread, let errors accumulate.
+	if (QThread::currentThread() != QCoreApplication::instance()->thread()) {
+		return;
+	}
+
 	MainWindow *mainwindow = MainWindow::instance();
 	if (mainwindow) {
 		mainwindow->getNotificationWidget()->showNotification(get_error_string(), KMessageWidget::Error);
 	}
 }
 
+void MainWindow::showErrors()
+{
+	const char *error = get_error_string();
+	if (error && error[0]) {
+		getNotificationWidget()->showNotification(error, KMessageWidget::Error);
+	}
+}
 
 MainWindow::MainWindow() : QMainWindow(),
 	actionNextDive(0),
diff --git a/desktop-widgets/mainwindow.h b/desktop-widgets/mainwindow.h
index edd3fb0c..50590aa2 100644
--- a/desktop-widgets/mainwindow.h
+++ b/desktop-widgets/mainwindow.h
@@ -89,6 +89,10 @@ public:
 	void enableDisableCloudActions();
 	void setCheckedActionFilterTags(bool checked);
 
+	// Shows errors that have accumulated.
+	// Must be called from GUI thread.
+	void showErrors();
+
 private
 slots:
 	/* file menu action */
___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface


Re: Thread related crash on failed dive computer download

2017-10-31 Thread Dirk Hohndel
On Tue, Oct 31, 2017 at 08:10:32PM +0100, Berthold Stoeger wrote:
> > On Tue, Oct 31, 2017 at 03:52:28PM +0100, Berthold Stoeger wrote:
> > > As a FYI, the patch attached, though without commit-message because it is
> > > simply too ugly. It works for my particular test case, but...
> > 
> > Why is it too ugly?
> 
> Duplication of the va_args thing. At a place where it doesn't belong.

Shrug.

> Movement of C-style strings to QString back to C-style strings back to 
> QStrings. Oh my.

Yes, that always bothers me when we transition through C and C++ code and
back. We should try to keep them as just C strings for as long as
possible.

> No encapsulation whatsoever. C function defined in C++ header and declared 
> extern in C source.
> Etc. :(

Shrug. I'm trying to get a last minute fix for a bug that I introduced...

> > > + errorList.setLocalData(QStringList());
> > 
> > So we initialize to an empty list. I'm unclear if there could be messages
> > still in that local storage that haven't been delivered, yet.
> 
> You're right, the local storage should be gone once the thread exits, even 
> though the thread object is reused.
> 
> What should be cleared though is the "errors" member, because the object is 
> reused.
> 
> BTW: Another bug: the code doesn't clear the membuffer.

Oops.

> Forget the std::move - it should be removed. This would be used in standard C+
> + where list assignment makes a deep copy. A std::move on the other hand just 
> replaces pointers. But as I just learned, Qt containers do copy on write, so 
> this is not necessary. I have no idea how they would react to a std::move.

OK

> > > +// Hack! for report_error_thread()
> > > +#include "core/downloadfromdcthread.h"
> > 
> > Why is that a hack?
> 
> Because downloadfromdcthread.h should not be needed here?

This should go into a helper .h file.

> > > @@ -220,6 +220,9 @@ void DownloadFromDCWidget::updateState(states state)
> > > 
> > >   // got an error
> > >   else if (state == ERROR) {
> > >   
> > >   timer->stop();
> > > 
> > > + foreach(const QString , thread.errors) {
> > > + report_message(s.toUtf8().data());
> > > + }
> > 
> > This one I don't understand... why aren't these reported when we do the
> > std::move() above?
> 
> On completion, the thread moved the messages collected in the thread local 
> area to the "errors" member of the thread object. And here the main thread 
> displays the collected messages.
> 
> To be honest, apart from the errors mentioned above, I'm not even sure that 
> the idea behind the code is sound. If you want, I can send an updated/cleaned 
> up patch, but I have a bad feeling with this one. There probably should at 
> least be a non-varargs version.

The idea that the thread collects its messages and that the caller of the
thread then initiates their display when it is done seems sound to me.

I'd really like a version that's good enough to use in 4.7.2 - and I'm
open to further refinement once that's released.

/D
___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface


Re: Thread related crash on failed dive computer download

2017-10-31 Thread Berthold Stoeger
On Dienstag, 31. Oktober 2017 19:25:03 CET Dirk Hohndel wrote:

> On Tue, Oct 31, 2017 at 03:52:28PM +0100, Berthold Stoeger wrote:
> > As a FYI, the patch attached, though without commit-message because it is
> > simply too ugly. It works for my particular test case, but...
> 
> Why is it too ugly?

Duplication of the va_args thing. At a place where it doesn't belong.
Movement of C-style strings to QString back to C-style strings back to 
QStrings. Oh my.
No encapsulation whatsoever. C function defined in C++ header and declared 
extern in C source.
Etc. :(

> > +   errorList.setLocalData(QStringList());
> 
> So we initialize to an empty list. I'm unclear if there could be messages
> still in that local storage that haven't been delivered, yet.

You're right, the local storage should be gone once the thread exits, even 
though the thread object is reused.

What should be cleared though is the "errors" member, because the object is 
reused.

BTW: Another bug: the code doesn't clear the membuffer.

> > if (errorText)
> > 
> > error = str_error(errorText, internalData->devname,
> > internalData->vendor, internalData->product);> 
> > +   errors = std::move(errorList.localData());
> > +
>
> Because this seems like it would ('move'?) clear out the local storage,
> correct?

Forget the std::move - it should be removed. This would be used in standard C+
+ where list assignment makes a deep copy. A std::move on the other hand just 
replaces pointers. But as I just learned, Qt containers do copy on write, so 
this is not necessary. I have no idea how they would react to a std::move.

> > +
> > +// Dirty hack: report_error(...) crashed when not called from main
> > thread.
> > +// Thus, replace report_error(...) calls by report_error_thread(...)
> > +// calls, which collect error messages in thread local storage.
> > +extern "C" int report_error_thread(const char *fmt, ...);
> > +
> 
> Again, why is this a dirty hack?

See above.

> > +// Hack! for report_error_thread()
> > +#include "core/downloadfromdcthread.h"
> 
> Why is that a hack?

Because downloadfromdcthread.h should not be needed here?

> > @@ -220,6 +220,9 @@ void DownloadFromDCWidget::updateState(states state)
> > 
> > // got an error
> > else if (state == ERROR) {
> > 
> > timer->stop();
> > 
> > +   foreach(const QString , thread.errors) {
> > +   report_message(s.toUtf8().data());
> > +   }
> 
> This one I don't understand... why aren't these reported when we do the
> std::move() above?

On completion, the thread moved the messages collected in the thread local 
area to the "errors" member of the thread object. And here the main thread 
displays the collected messages.

To be honest, apart from the errors mentioned above, I'm not even sure that 
the idea behind the code is sound. If you want, I can send an updated/cleaned 
up patch, but I have a bad feeling with this one. There probably should at 
least be a non-varargs version.

Berthold
___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface


Re: Thread related crash on failed dive computer download

2017-10-31 Thread Dirk Hohndel
On Tue, Oct 31, 2017 at 03:52:28PM +0100, Berthold Stoeger wrote:
> Hi Dirk,
> 
> On Dienstag, 31. Oktober 2017 13:52:45 CET you wrote:
> > > On Oct 31, 2017, at 4:43 AM, Jan Mulder  wrote:
> > > 
> > > On 31-10-17 12:05, Jan Mulder wrote:
> > > Sounds like a fundamental problem, with the new way of error reporting ...
> > > but not sure ...
> > Yes, that makes sense. Doing the showNotification there might fail. Instead
> > we should just let the UI thread know that there is a new message to
> > display. This may prevent 4.7.2 today as I won't have time to look at this
> > until late this evening and then we might want to test it :-)
> > 
> > Unless one of you can look at it while I'm on an airplane for the next two
> > hours
> 
> After a quick look it seems that report_error(...) is used quite liberally in 
> core/qt-ble.cpp, core/downloadfromthread.cpp and core/libdivecomputer.c,
> which are all executed in DownloadThread context. Detangling this seems hard 
> as part of it is in C, other parts are callbacks from libdivecomputer, etc.

Yes, the idea was to simply just stage errors (and warnings, frankly) and
have the UI deal with them. Which we did a terrible job of in the past.
My solution made this much more responsive - and broken because of the
thread issue that I missed.

> As a quick stop-gap measure I replaced the report_error(...) calls by 
> report_error_thread(...) calls, which collect the error messages in a thread 
> local storage. The collected error messages are then passed to the main 
> thread 
> via a member.

I only had a minute to glance through the code, but it seems reasonable.

> As a FYI, the patch attached, though without commit-message because it is 
> simply too ugly. It works for my particular test case, but...

Why is it too ugly?

/D

>  void DownloadThread::run()
>  {
> + errorList.setLocalData(QStringList());

So we initialize to an empty list. I'm unclear if there could be messages
still in that local storage that haven't been delivered, yet.

>   if (errorText)
>   error = str_error(errorText, internalData->devname, 
> internalData->vendor, internalData->product);
>  
> + errors = std::move(errorList.localData());
> +

Because this seems like it would ('move'?) clear out the local storage,
correct?

> +
> +// Dirty hack: report_error(...) crashed when not called from main thread.
> +// Thus, replace report_error(...) calls by report_error_thread(...)
> +// calls, which collect error messages in thread local storage.
> +extern "C" int report_error_thread(const char *fmt, ...);
> +

Again, why is this a dirty hack?

> diff --git a/core/qt-ble.cpp b/core/qt-ble.cpp
> index a72736d4..277e44c6 100644
> --- a/core/qt-ble.cpp
> +++ b/core/qt-ble.cpp
> @@ -18,6 +18,9 @@
>  #include "core/qt-ble.h"
>  #include "core/btdiscovery.h"
>  
> +// Hack! for report_error_thread()
> +#include "core/downloadfromdcthread.h"

Why is that a hack?

> @@ -220,6 +220,9 @@ void DownloadFromDCWidget::updateState(states state)
>   // got an error
>   else if (state == ERROR) {
>   timer->stop();
> + foreach(const QString , thread.errors) {
> + report_message(s.toUtf8().data());
> + }

This one I don't understand... why aren't these reported when we do the
std::move() above?

/D
___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface


Re: Thread related crash on failed dive computer download

2017-10-31 Thread Berthold Stoeger
Hi Dirk,

On Dienstag, 31. Oktober 2017 13:52:45 CET you wrote:
> > On Oct 31, 2017, at 4:43 AM, Jan Mulder  wrote:
> > 
> > On 31-10-17 12:05, Jan Mulder wrote:
> > Sounds like a fundamental problem, with the new way of error reporting ...
> > but not sure ...
> Yes, that makes sense. Doing the showNotification there might fail. Instead
> we should just let the UI thread know that there is a new message to
> display. This may prevent 4.7.2 today as I won't have time to look at this
> until late this evening and then we might want to test it :-)
> 
> Unless one of you can look at it while I'm on an airplane for the next two
> hours

After a quick look it seems that report_error(...) is used quite liberally in 
core/qt-ble.cpp, core/downloadfromthread.cpp and core/libdivecomputer.c,
which are all executed in DownloadThread context. Detangling this seems hard 
as part of it is in C, other parts are callbacks from libdivecomputer, etc.

As a quick stop-gap measure I replaced the report_error(...) calls by 
report_error_thread(...) calls, which collect the error messages in a thread 
local storage. The collected error messages are then passed to the main thread 
via a member.

As a FYI, the patch attached, though without commit-message because it is 
simply too ugly. It works for my particular test case, but...

Berthold

diff --git a/core/downloadfromdcthread.cpp b/core/downloadfromdcthread.cpp
index e01de288..3ce4d7ab 100644
--- a/core/downloadfromdcthread.cpp
+++ b/core/downloadfromdcthread.cpp
@@ -1,8 +1,10 @@
 #include "downloadfromdcthread.h"
 #include "core/libdivecomputer.h"
+#include "core/membuffer.h"
 #include "core/subsurface-qt/SettingsObjectWrapper.h"
 #include 
 #include 
+#include 
 
 QStringList vendorList;
 QHash productList;
@@ -10,6 +12,24 @@ static QHash mobileProductList;	// BT, BLE or FTDI support
 QMap descriptorLookup;
 ConnectionListModel connectionListModel;
 
+QThreadStorage errorList;
+
+#define VA_BUF(b, fmt) do { va_list args; va_start(args, fmt); put_vformat(b, fmt, args); va_end(args); } while (0)
+
+extern "C"
+int report_error_thread(const char *fmt, ...)
+{
+	struct membuffer buf = { 0 };
+
+	/* Previous unprinted errors? Add a newline in between */
+	VA_BUF(, fmt);
+	mb_cstring();
+
+	if(buf.len)
+		errorList.localData().append(QString(buf.buffer));
+	return -1;
+}
+
 static QString str_error(const char *fmt, ...)
 {
 	va_list args;
@@ -27,6 +47,7 @@ DownloadThread::DownloadThread()
 
 void DownloadThread::run()
 {
+	errorList.setLocalData(QStringList());
 	auto internalData = m_data->internalData();
 	internalData->descriptor = descriptorLookup[m_data->vendor() + m_data->product()];
 	internalData->download_table = 	
@@ -51,6 +72,8 @@ void DownloadThread::run()
 	if (errorText)
 		error = str_error(errorText, internalData->devname, internalData->vendor, internalData->product);
 
+	errors = std::move(errorList.localData());
+
 	qDebug() << "Finishing the thread" << errorText << "dives downloaded" << downloadTable.nr;
 	auto dcs = SettingsObjectWrapper::instance()->dive_computer_settings;
 	dcs->setVendor(internalData->vendor);
diff --git a/core/downloadfromdcthread.h b/core/downloadfromdcthread.h
index ff4b8f39..2dd134ff 100644
--- a/core/downloadfromdcthread.h
+++ b/core/downloadfromdcthread.h
@@ -79,6 +79,7 @@ public:
 
 	Q_INVOKABLE DCDeviceData *data();
 	QString error;
+	QStringList errors;
 
 private:
 	DCDeviceData *m_data;
@@ -101,4 +102,10 @@ extern QStringList vendorList;
 extern QHash productList;
 extern QMap descriptorLookup;
 extern ConnectionListModel connectionListModel;
+
+// Dirty hack: report_error(...) crashed when not called from main thread.
+// Thus, replace report_error(...) calls by report_error_thread(...)
+// calls, which collect error messages in thread local storage.
+extern "C" int report_error_thread(const char *fmt, ...);
+
 #endif
diff --git a/core/libdivecomputer.c b/core/libdivecomputer.c
index d8fe54e8..86456751 100644
--- a/core/libdivecomputer.c
+++ b/core/libdivecomputer.c
@@ -18,6 +18,8 @@
 #include "libdivecomputer.h"
 #include "core/version.h"
 
+int report_error_thread(const char *fmt, ...);
+
 #if !defined(SSRF_LIBDC_VERSION) || SSRF_LIBDC_VERSION < 2
 #pragma message "Subsurface requires a reasonably current version of the Subsurface-branch"
 #pragma message "of libdivecomputer (at least version 2 of our API)."
@@ -106,10 +108,10 @@ static int parse_gasmixes(device_data_t *devdata, struct dive *dive, dc_parser_t
 	if (rc == DC_STATUS_SUCCESS) {
 		if (ntanks && ntanks < ngases) {
 			shown_warning = true;
-			report_error("Warning: different number of gases (%d) and cylinders (%d)", ngases, ntanks);
+			report_error_thread("Warning: different number of gases (%d) and cylinders (%d)", ngases, ntanks);
 		} else if (ntanks > ngases) {
 			shown_warning = true;
-			report_error("Warning: 

Re: Thread related crash on failed dive computer download

2017-10-31 Thread Dirk Hohndel

> On Oct 31, 2017, at 4:43 AM, Jan Mulder  wrote:
> 
> On 31-10-17 12:05, Jan Mulder wrote:
> 
>> But Dirk has to worry, as it suspiciously close to the new error handling 
>> code that has just landed in master.
> 
> for Dirk:
> 
> from Qt 5.9.2 from build from source with debug:
> 
> failed to connect to the controller  00:80:25:4A:0F:C2 with error ""
> QObject: Cannot create children for a parent that is in a different thread.
> (Parent is QLabel(0x562909c0), parent's thread is 
> QThread(0x55ea2cc0), current thread is DownloadThread(0x7fffcbc0)
> ASSERT failure in QCoreApplication::sendEvent: "Cannot send events to objects 
> owned by a different thread. Current thread 0x0x7fffcbc0. Receiver '' (of 
> type 'QToolButton') was created in thread 0x0x55ea2cc0", file 
> kernel/qcoreapplication.cpp, line 563
> 
> Sounds like a fundamental problem, with the new way of error reporting ... 
> but not sure ...

Yes, that makes sense. Doing the showNotification there might fail. Instead we 
should just let the UI thread know that there is a new message to display. This 
may prevent 4.7.2 today as I won't have time to look at this until late this 
evening and then we might want to test it :-)

Unless one of you can look at it while I'm on an airplane for the next two hours

/D
___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface


Re: Thread related crash on failed dive computer download

2017-10-31 Thread Jan Mulder

On 31-10-17 12:05, Jan Mulder wrote:

But Dirk has to worry, as it suspiciously close to the new error 
handling code that has just landed in master.


for Dirk:

from Qt 5.9.2 from build from source with debug:

failed to connect to the controller  00:80:25:4A:0F:C2 with error ""
QObject: Cannot create children for a parent that is in a different thread.
(Parent is QLabel(0x562909c0), parent's thread is 
QThread(0x55ea2cc0), current thread is DownloadThread(0x7fffcbc0)
ASSERT failure in QCoreApplication::sendEvent: "Cannot send events to 
objects owned by a different thread. Current thread 0x0x7fffcbc0. 
Receiver '' (of type 'QToolButton') was created in thread 
0x0x55ea2cc0", file kernel/qcoreapplication.cpp, line 563


Sounds like a fundamental problem, with the new way of error reporting 
... but not sure ...


As compiled with debug it trips on the assert here.

--jan


___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface


Re: Thread related crash on failed dive computer download

2017-10-31 Thread Jan Mulder

Berthold,

On 31-10-17 11:53, Berthold Stoeger wrote:

Dear all,

I tried to reset my git repository to head and now I'm getting a crash which I
didn't get before. I tried "git clean -fx", but that didn't help.

The crash happens when using invalid Bluetooth addresses. Call trace:
- core/qt-ble.c l.326: report_error(...)
- core/errorhelper.c l.43: error_cb()
- desktop-widgets/mainwindow.cpp l.93: showNotification(...)

This call doesn't return but produces the following output:


The crash probably has nothing to do with the membuffer code (which itself
looks scarily thread-unsafe):
Replacing l. 93 of desktop-widgets/mainwindow.cpp by
   mainwindow->getNotificationWidget()->showNotification("test",...);
still produces the crash. But
   mainwindow->getNotificationWidget()->showNotification("",...);
does not crash.

I have a feeling that I'm doing something obviously wrong, but I don't see it.
Thank you,


Don't worry. It is not your build environment as I can easily reproduce 
your crash. BLE with incorrect MAC address.


But Dirk has to worry, as it suspiciously close to the new error 
handling code that has just landed in master.


--jan

___
subsurface mailing list
subsurface@subsurface-divelog.org
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface