Re: [BUGS] [HACKERS] Segmentation fault in libpq

2017-07-03 Thread Michal Novotny



On 07/03/2017 04:58 AM, Craig Ringer wrote:
> On 3 July 2017 at 03:12, Andres Freund wrote:
>> Hi,
>>
>> On 2017-07-02 20:58:52 +0200, Michal Novotný wrote:
>>> thank you all for your advice. I've investigated this a little more and
>>> it finally turned out that it's not a bug in libpq, although I got
>>> confused because the crash was several libpq functions deep. The bug was
>>> really on our side: we were using the connection pointer after calling
>>> PQfinish(). The code is pretty complex, so it took some time to
>>> investigate. I would like to apologize for "blaming" libpq instead of our
>>> code.
>>
>> Usually using a tool like valgrind is quite helpful to find issues like
>> that, because it'll show you the call-stack accessing the memory and
>> *also* the call-stack that led to the memory being freed.
>
> Yep, huge help.
>
> BTW, on Windows, the free tool DrMemory (now 64-bit too, yay) or
> commercial Purify work great.


Well, good to know about the Windows tools; however, we use Linux, so
that's not a big deal for us. Unfortunately, it's easy to miss something
in valgrind if you have a multi-threaded library linked to libpq and that
library is used in conjunction with other libraries that share some of
the data among them.


Thanks once again,
Michal

--
Michal Novotny
System Development Lead
michal.novo...@greycortex.com

GREYCORTEX s.r.o.
Purkynova 127, 61200 Brno
Czech Republic
www.greycortex.com



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [BUGS] [HACKERS] Segmentation fault in libpq

2017-07-03 Thread Michal Novotny



On 07/02/2017 09:12 PM, Andres Freund wrote:
> Hi,
>
> On 2017-07-02 20:58:52 +0200, Michal Novotný wrote:
>> thank you all for your advice. I've investigated this a little more and
>> it finally turned out that it's not a bug in libpq, although I got
>> confused because the crash was several libpq functions deep. The bug was
>> really on our side: we were using the connection pointer after calling
>> PQfinish(). The code is pretty complex, so it took some time to
>> investigate. I would like to apologize for "blaming" libpq instead of our
>> code.
>
> Usually using a tool like valgrind is quite helpful to find issues like
> that, because it'll show you the call-stack accessing the memory and
> *also* the call-stack that led to the memory being freed.
>
> - Andres


Well, I tried, but I was unable to locate the issue that way, so I had to
investigate our code a little further, and I was finally able to find it.


Thanks again,
Michal




Re: [BUGS] [HACKERS] Segmentation fault in libpq

2017-07-02 Thread Craig Ringer
On 3 July 2017 at 03:12, Andres Freund  wrote:
> Hi,
>
> On 2017-07-02 20:58:52 +0200, Michal Novotný wrote:
>> thank you all for your advice. I've investigated this a little more and
>> it finally turned out that it's not a bug in libpq, although I got
>> confused because the crash was several libpq functions deep. The bug was
>> really on our side: we were using the connection pointer after calling
>> PQfinish(). The code is pretty complex, so it took some time to
>> investigate. I would like to apologize for "blaming" libpq instead of our
>> code.
>
> Usually using a tool like valgrind is quite helpful to find issues like
> that, because it'll show you the call-stack accessing the memory and
> *also* the call-stack that led to the memory being freed.

Yep, huge help.

BTW, on Windows, the free tool DrMemory (now 64-bit too, yay) or
commercial Purify work great.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [BUGS] [HACKERS] Segmentation fault in libpq

2017-07-02 Thread Michal Novotný
Hi all,
thank you all for your advice. I've investigated this a little more and
it finally turned out that it's not a bug in libpq, although I got
confused because the crash was several libpq functions deep. The bug was
really on our side: we were using the connection pointer after calling
PQfinish(). The code is pretty complex, so it took some time to
investigate. I would like to apologize for "blaming" libpq instead of our
code.

Anyway, thank you all for valuable advice.
Have a great time,
Michal
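
One common way to guard against a use-after-PQfinish() bug like the one
described in this message is to clear the pointer at the single place
where the connection is closed. The following is only a minimal sketch
under assumptions: fake_conn, db_handle, db_open(), db_close(),
db_is_open() and db_query() are hypothetical names, and fake_conn stands
in for PGconn so that the example builds without libpq. In real code
db_close() would call PQfinish().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PGconn, so the sketch builds without libpq. */
typedef struct fake_conn {
    int socket_fd;
} fake_conn;

/* Hypothetical owner of the one and only copy of the connection pointer. */
typedef struct db_handle {
    fake_conn *conn;
} db_handle;

/* Stand-in for connecting (PQconnectdb() in real code). */
static int db_open(db_handle *h)
{
    assert(h != NULL);
    h->conn = malloc(sizeof *h->conn);
    if (h->conn == NULL)
        return -1;
    h->conn->socket_fd = 3;     /* arbitrary value for the sketch */
    return 0;
}

/* Close the connection AND forget the pointer in the same place. */
static void db_close(db_handle *h)
{
    assert(h != NULL);
    free(h->conn);              /* real code: PQfinish(h->conn) */
    h->conn = NULL;             /* later use fails a check, not a free'd read */
}

static int db_is_open(const db_handle *h)
{
    return h != NULL && h->conn != NULL;
}

/* Every user checks the handle before touching the connection. */
static int db_query(db_handle *h)
{
    if (!db_is_open(h))
        return -1;              /* caller may reconnect or report an error */
    return h->conn->socket_fd;  /* placeholder for the real query */
}
```

This only helps when all users go through the same handle. Any other
copy of the raw pointer kept elsewhere in the program still dangles
after the close, which is the kind of case valgrind is meant to catch.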

2017-06-29 16:30 GMT+02:00 Merlin Moncure :

> On Thu, Jun 29, 2017 at 9:12 AM, Tom Lane  wrote:
> > Merlin Moncure  writes:
> >> On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny
> >>  wrote:
> >>> Could you please help me based on information provided above?
> >
> >> You might want to run your code through some analysis tools (for
> >> example, valgrind).
> >
> > valgrind is not a perfect tool for finding that kind of problem,
> > especially if you can't reproduce the crash reliably; but at least
> > valgrind is readily available and easy to use, so you might as
> > well start there and see if it finds anything.  If you have access
> > to any sort of static analysis tool (eg, Coverity), that might be
> > more likely to help.  Or you could fall back on manual code
> > auditing, if the program isn't very big.
>
> clang static analyzer is another good tool to check out
>
> https://clang-analyzer.llvm.org/
>
> merlin
>





Re: [BUGS] [HACKERS] Segmentation fault in libpq

2017-07-02 Thread Andres Freund
Hi,

On 2017-07-02 20:58:52 +0200, Michal Novotný wrote:
> thank you all for your advice. I've investigated this a little more and
> it finally turned out that it's not a bug in libpq, although I got
> confused because the crash was several libpq functions deep. The bug was
> really on our side: we were using the connection pointer after calling
> PQfinish(). The code is pretty complex, so it took some time to
> investigate. I would like to apologize for "blaming" libpq instead of our
> code.

Usually using a tool like valgrind is quite helpful to find issues like
that, because it'll show you the call-stack accessing the memory and
*also* the call-stack that led to the memory being freed.

- Andres




Re: [BUGS] [HACKERS] Segmentation fault in libpq

2017-06-29 Thread Merlin Moncure
On Thu, Jun 29, 2017 at 9:12 AM, Tom Lane  wrote:
> Merlin Moncure  writes:
>> On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny
>>  wrote:
>>> Could you please help me based on information provided above?
>
>> You might want to run your code through some analysis tools (for
>> example, valgrind).
>
> valgrind is not a perfect tool for finding that kind of problem,
> especially if you can't reproduce the crash reliably; but at least
> valgrind is readily available and easy to use, so you might as
> well start there and see if it finds anything.  If you have access
> to any sort of static analysis tool (eg, Coverity), that might be
> more likely to help.  Or you could fall back on manual code
> auditing, if the program isn't very big.

clang static analyzer is another good tool to check out

https://clang-analyzer.llvm.org/

merlin




Re: [BUGS] [HACKERS] Segmentation fault in libpq

2017-06-29 Thread Tom Lane
Merlin Moncure  writes:
> On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny
>  wrote:
>> Could you please help me based on information provided above?

> You might want to run your code through some analysis tools (for
> example, valgrind).

Yeah, that's what I was about to suggest.  pqexpbuffer.c is pretty
small and paranoid code; it's really hard to see how it could have
crashed there unless something else corrupted its data structure.
While it's always possible that the "something else" was a wild
store from elsewhere in libpq, the lack of similar reports from
others and the fact that you don't sound to be doing anything very
exotic in terms of libpq requests both weigh against that theory.
If I had to bet given this much evidence, I'd bet on a wild store
from somewhere in your application having corrupted the
conn->errorMessage before PQexecParams was entered.  C is not a
language that does much to prevent that kind of bug for you.
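
To see why a crash on that line points at a damaged connection object
rather than at pqexpbuffer.c, it may help to look at what the faulting
statement needs. The sketch below is a simplified stand-in, not the
libpq source: sketch_buf, sketch_init(), sketch_reset() and
backtrace_offset() are hypothetical names. The statement from the
backtrace, str->data[0] = '\0', dereferences a pointer stored inside the
buffer struct, so it faults when that stored pointer is garbage even
though str itself is a plausible address. The two addresses are taken
from the backtrace in the original report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for an expandable string buffer (hypothetical). */
typedef struct sketch_buf {
    char  *data;    /* heap pointer stored inside the struct */
    size_t len;
    size_t maxlen;
} sketch_buf;

static int sketch_init(sketch_buf *b)
{
    assert(b != NULL);
    b->maxlen = 256;
    b->len = 0;
    b->data = malloc(b->maxlen);
    if (b->data == NULL)
        return -1;
    b->data[0] = '\0';
    return 0;
}

/* Analogue of the statement at pqexpbuffer.c:152 in the backtrace. */
static void sketch_reset(sketch_buf *b)
{
    if (b == NULL)
        return;
    b->len = 0;
    b->data[0] = '\0';  /* faults if b->data was overwritten or freed */
}

/* Distance between the str and conn arguments shown in the backtrace. */
static uintptr_t backtrace_offset(void)
{
    const uintptr_t conn = 0x9f46d0;    /* conn=0x9f46d0 in frame #1 */
    const uintptr_t str  = 0x9f4a28;    /* str=0x9f4a28 in frame #0  */
    return str - conn;                  /* 0x358 bytes */
}
```

Since str lies 0x358 bytes past conn, it is consistent with the buffer
being a member embedded in the connection object (conn->errorMessage, as
named above). Anything that overwrites or frees that object leaves the
stored data pointer invalid, and the next reset is where it shows up.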

valgrind is not a perfect tool for finding that kind of problem,
especially if you can't reproduce the crash reliably; but at least
valgrind is readily available and easy to use, so you might as
well start there and see if it finds anything.  If you have access
to any sort of static analysis tool (eg, Coverity), that might be
more likely to help.  Or you could fall back on manual code
auditing, if the program isn't very big.

regards, tom lane




Re: [HACKERS] Segmentation fault in libpq

2017-06-29 Thread Merlin Moncure
On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny
 wrote:
> Hi,
>
> comments inline ...
>
>
>
> On 06/29/2017 03:08 PM, Merlin Moncure wrote:
>>
>> On Thu, Jun 29, 2017 at 4:01 AM, Michal Novotny
>>  wrote:
>>>
>>> Hi all,
>>>
>>> we've developed an application using libpq to access a table in the PgSQL
>>> database but we're sometimes experiencing segmentation fault on
>>> resetPQExpBuffer() function of libpq called from PQexecParams() with
>>> prepared query.
>>>
>>> PostgreSQL version is 9.6.3 and the backtrace is:
>>>
>>> Core was generated by `/usr/ti/bin/status-monitor2 -m
>>> /usr/lib64/status-monitor2/modules'.
>>> Program terminated with signal 11, Segmentation fault.
>>> #0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
>>> 152 str->data[0] = '\0';
>>>
>>> Thread 1 (Thread 0x7fdf68de3840 (LWP 3525)):
>>> #0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
>>> No locals.
>>> #1  0x7fdf66e0333d in PQsendQueryStart (conn=conn@entry=0x9f46d0) at
>>> fe-exec.c:1371
>>> No locals.
>>> #2  0x7fdf66e044b9 in PQsendQueryParams (conn=conn@entry=0x9f46d0,
>>> command=command@entry=0x409a98 "SELECT min, hour, day, month, dow,
>>> sensor,
>>> module, params, priority, rt_due FROM sm.cron WHERE sensor = $1 ORDER BY
>>> priority DESC", nParams=nParams@entry=1, paramTypes=paramTypes@entry=0x0,
>>> paramValues=paramValues@entry=0xa2b7b0,
>>> paramLengths=paramLengths@entry=0x0,
>>> paramFormats=paramFormats@entry=0x0, resultFormat=resultFormat@entry=0)
>>> at
>>> fe-exec.c:1192
>>> No locals.
>>> #3  0x7fdf66e0552b in PQexecParams (conn=0x9f46d0, command=0x409a98
>>> "SELECT min, hour, day, month, dow, sensor, module, params, priority,
>>> rt_due
>>> FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=1,
>>> paramTypes=0x0, paramValues=0xa2b7b0, paramLengths=0x0, paramFormats=0x0,
>>> resultFormat=0) at fe-exec.c:1871
>>> No locals.
>>>
>>> Unfortunately we didn't have more information from the crash, at least
>>> for
>>> now.
>>>
>>> Is this a known issue and can you help me with this one?
>>
>> Is your application written in C?  We would need to completely rule
>> out your code (say, by double freeing result or something else nasty)
>> before assuming problem was within libpq itself, particularly in this
>> area of the code.  How reproducible is the problem?
>>
>> merlin
>
>
> The application is written in plain C. The issue is intermittent:
> sometimes it happens and sometimes it doesn't.  When it happens, it
> crashes the application, but as it's a systemd unit with the
> Restart=on-failure option, it's automatically restarted.
>
> What's being done is:
> 1) Ensure connection already exists and create a new one if it doesn't exist
> yet
> 2) Run PQexecParams() with specified $params that has $params_cnt elements:
>
> res = PQexecParams(conn, prepared_query, params_cnt, NULL, (const char
> **)params, NULL, NULL, 0);
>
> 3) Check for result and report error and exit if "PQresultStatus(res) !=
> PGRES_TUPLES_OK"
> 4) Do some processing with the result
> 5) Clear result using PQclear()
>
> It usually works fine but sometimes it's crashing and I don't know how to
> investigate further.
>
> Could you please help me based on information provided above?

You might want to run your code through some analysis tools (for
example, valgrind).  Short of that, to get help here you need to post
the code for review. How big is your application?

merlin




Re: [HACKERS] Segmentation fault in libpq

2017-06-29 Thread Michal Novotny

Hi,

comments inline ...


On 06/29/2017 03:08 PM, Merlin Moncure wrote:
> On Thu, Jun 29, 2017 at 4:01 AM, Michal Novotny wrote:
>> Hi all,
>>
>> we've developed an application using libpq to access a table in the PgSQL
>> database but we're sometimes experiencing segmentation fault on
>> resetPQExpBuffer() function of libpq called from PQexecParams() with
>> prepared query.
>>
>> PostgreSQL version is 9.6.3 and the backtrace is:
>>
>> Core was generated by `/usr/ti/bin/status-monitor2 -m
>> /usr/lib64/status-monitor2/modules'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
>> 152 str->data[0] = '\0';
>>
>> Thread 1 (Thread 0x7fdf68de3840 (LWP 3525)):
>> #0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
>> No locals.
>> #1  0x7fdf66e0333d in PQsendQueryStart (conn=conn@entry=0x9f46d0) at
>> fe-exec.c:1371
>> No locals.
>> #2  0x7fdf66e044b9 in PQsendQueryParams (conn=conn@entry=0x9f46d0,
>> command=command@entry=0x409a98 "SELECT min, hour, day, month, dow, sensor,
>> module, params, priority, rt_due FROM sm.cron WHERE sensor = $1 ORDER BY
>> priority DESC", nParams=nParams@entry=1, paramTypes=paramTypes@entry=0x0,
>> paramValues=paramValues@entry=0xa2b7b0, paramLengths=paramLengths@entry=0x0,
>> paramFormats=paramFormats@entry=0x0, resultFormat=resultFormat@entry=0) at
>> fe-exec.c:1192
>> No locals.
>> #3  0x7fdf66e0552b in PQexecParams (conn=0x9f46d0, command=0x409a98
>> "SELECT min, hour, day, month, dow, sensor, module, params, priority, rt_due
>> FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=1,
>> paramTypes=0x0, paramValues=0xa2b7b0, paramLengths=0x0, paramFormats=0x0,
>> resultFormat=0) at fe-exec.c:1871
>> No locals.
>>
>> Unfortunately we didn't have more information from the crash, at least for
>> now.
>>
>> Is this a known issue and can you help me with this one?
>
> Is your application written in C?  We would need to completely rule
> out your code (say, by double freeing result or something else nasty)
> before assuming problem was within libpq itself, particularly in this
> area of the code.  How reproducible is the problem?
>
> merlin


The application is written in plain C. The issue is intermittent:
sometimes it happens and sometimes it doesn't.  When it happens, it
crashes the application, but as it's a systemd unit with the
Restart=on-failure option, it's automatically restarted.


What's being done is:
1) Ensure connection already exists and create a new one if it doesn't 
exist yet

2) Run PQexecParams() with specified $params that has $params_cnt elements:

res = PQexecParams(conn, prepared_query, params_cnt, NULL,
                   (const char **)params, NULL, NULL, 0);


3) Check for result and report error and exit if "PQresultStatus(res) != 
PGRES_TUPLES_OK"

4) Do some processing with the result
5) Clear result using PQclear()

It usually works fine but sometimes it's crashing and I don't know how 
to investigate further.


Could you please help me based on information provided above?

Thanks,
Michal



Re: [HACKERS] Segmentation fault in libpq

2017-06-29 Thread Merlin Moncure
On Thu, Jun 29, 2017 at 4:01 AM, Michal Novotny
 wrote:
> Hi all,
>
> we've developed an application using libpq to access a table in the PgSQL
> database but we're sometimes experiencing segmentation fault on
> resetPQExpBuffer() function of libpq called from PQexecParams() with
> prepared query.
>
> PostgreSQL version is 9.6.3 and the backtrace is:
>
> Core was generated by `/usr/ti/bin/status-monitor2 -m
> /usr/lib64/status-monitor2/modules'.
> Program terminated with signal 11, Segmentation fault.
> #0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
> 152 str->data[0] = '\0';
>
> Thread 1 (Thread 0x7fdf68de3840 (LWP 3525)):
> #0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
> No locals.
> #1  0x7fdf66e0333d in PQsendQueryStart (conn=conn@entry=0x9f46d0) at
> fe-exec.c:1371
> No locals.
> #2  0x7fdf66e044b9 in PQsendQueryParams (conn=conn@entry=0x9f46d0,
> command=command@entry=0x409a98 "SELECT min, hour, day, month, dow, sensor,
> module, params, priority, rt_due FROM sm.cron WHERE sensor = $1 ORDER BY
> priority DESC", nParams=nParams@entry=1, paramTypes=paramTypes@entry=0x0,
> paramValues=paramValues@entry=0xa2b7b0, paramLengths=paramLengths@entry=0x0,
> paramFormats=paramFormats@entry=0x0, resultFormat=resultFormat@entry=0) at
> fe-exec.c:1192
> No locals.
> #3  0x7fdf66e0552b in PQexecParams (conn=0x9f46d0, command=0x409a98
> "SELECT min, hour, day, month, dow, sensor, module, params, priority, rt_due
> FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=1,
> paramTypes=0x0, paramValues=0xa2b7b0, paramLengths=0x0, paramFormats=0x0,
> resultFormat=0) at fe-exec.c:1871
> No locals.
>
> Unfortunately we didn't have more information from the crash, at least for
> now.
>
> Is this a known issue and can you help me with this one?

Is your application written in C?  We would need to completely rule
out your code (say, by double freeing result or something else nasty)
before assuming problem was within libpq itself, particularly in this
area of the code.  How reproducible is the problem?

merlin




[HACKERS] Segmentation fault in libpq

2017-06-29 Thread Michal Novotny

Hi all,

we've developed an application using libpq to access a table in the 
PgSQL database but we're sometimes experiencing segmentation fault on 
resetPQExpBuffer() function of libpq called from PQexecParams() with 
prepared query.


PostgreSQL version is 9.6.3 and the backtrace is:

Core was generated by `/usr/ti/bin/status-monitor2 -m 
/usr/lib64/status-monitor2/modules'.
Program terminated with signal 11, Segmentation fault.
#0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
152 str->data[0] = '\0';

Thread 1 (Thread 0x7fdf68de3840 (LWP 3525)):
#0  resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152
No locals.
#1  0x7fdf66e0333d in PQsendQueryStart (conn=conn@entry=0x9f46d0) at
fe-exec.c:1371
No locals.
#2  0x7fdf66e044b9 in PQsendQueryParams (conn=conn@entry=0x9f46d0,
command=command@entry=0x409a98 "SELECT min, hour, day, month, dow, sensor,
module, params, priority, rt_due FROM sm.cron WHERE sensor = $1 ORDER BY
priority DESC", nParams=nParams@entry=1, paramTypes=paramTypes@entry=0x0,
paramValues=paramValues@entry=0xa2b7b0, paramLengths=paramLengths@entry=0x0,
paramFormats=paramFormats@entry=0x0, resultFormat=resultFormat@entry=0) at
fe-exec.c:1192
No locals.
#3  0x7fdf66e0552b in PQexecParams (conn=0x9f46d0, command=0x409a98
"SELECT min, hour, day, month, dow, sensor, module, params, priority, rt_due
FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=1,
paramTypes=0x0, paramValues=0xa2b7b0, paramLengths=0x0, paramFormats=0x0,
resultFormat=0) at fe-exec.c:1871
No locals.

Unfortunately we didn't have more information from the crash, at least 
for now.


Is this a known issue and can you help me with this one?

Thanks,
Michal
