Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-27 Thread Tareq Aljabban
On Fri, Mar 16, 2012 at 8:34 PM, Greg Stark st...@mit.edu wrote:

 On Fri, Mar 16, 2012 at 11:29 PM, Jeff Davis pg...@j-davis.com wrote:
  There is a lot of difference between those two. In particular, it looks
  like the problem you are seeing is coming from the background writer,
  which is not running during initdb.

 The difference that comes to mind is that the postmaster forks. If the
 library opens any connections prior to forking and then uses them
 after forking that would work at first but it would get confused
 quickly once more than one backend tries to use the same connection.
 The data being sent would all be mixed together and they would see
 responses to requests other processes sent.
 You need to ensure that any network connections are opened up *after*
 the new processes are forked.


That's true: it turns out that the root of the problem is that HDFS has
trouble dealing with forked processes. However, there's no clear
suggestion on how to fix this.
I attached gdb to the writer process and got the following backtrace:

#0  0xb76f0430 in __kernel_vsyscall ()
#1  0xb6b2893d in ___newselect_nocancel () at
../sysdeps/unix/syscall-template.S:82
#2  0x0840ab46 in pg_usleep (microsec=20) at pgsleep.c:43
#3  0x0829ca9a in BgWriterNap () at bgwriter.c:642
#4  0x0829c882 in BackgroundWriterMain () at bgwriter.c:540
#5  0x0811b0ec in AuxiliaryProcessMain (argc=2, argv=0xbf982308) at
bootstrap.c:417
#6  0x082a9af1 in StartChildProcess (type=BgWriterProcess) at
postmaster.c:4427
#7  0x082a75de in reaper (postgres_signal_arg=17) at postmaster.c:2390
#8  signal handler called
#9  0xb76f0430 in __kernel_vsyscall ()
#10 0xb6b2893d in ___newselect_nocancel () at
../sysdeps/unix/syscall-template.S:82
#11 0x082a5b62 in ServerLoop () at postmaster.c:1391
#12 0x082a53e2 in PostmasterMain (argc=3, argv=0xa525c28) at
postmaster.c:1092
#13 0x0822dfa8 in main (argc=3, argv=0xa525c28) at main.c:188

Any ideas?


Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-19 Thread Tareq Aljabban
On Fri, Mar 16, 2012 at 8:34 PM, Greg Stark st...@mit.edu wrote:

 On Fri, Mar 16, 2012 at 11:29 PM, Jeff Davis pg...@j-davis.com wrote:
  There is a lot of difference between those two. In particular, it looks
  like the problem you are seeing is coming from the background writer,
  which is not running during initdb.

 The difference that comes to mind is that the postmaster forks. If the
 library opens any connections prior to forking and then uses them
 after forking that would work at first but it would get confused
 quickly once more than one backend tries to use the same connection.
 The data being sent would all be mixed together and they would see
 responses to requests other processes sent.


Open connections were the first thing I thought of; the HDFS
documentation claims the C interface is thread-safe. However, since I
noticed that different processes (with different process IDs) are
calling mdwrite(), that could well be the cause.
I tried reconnecting to HDFS on every mdwrite() request, but it didn't
work out.



 You need to ensure that any network connections are opened up *after*
 the new processes are forked.

 There are other things going on that could cause problems. initdb
 probably doesn't deal with many errors so it might not be causing any
 longjmps that happen when Postgres handles errors. I suspect it
 doesn't create any temporary files, etc.

 There's also a long history of threads not playing well with Postgres.
 I think that's overblown and I believe they should work fine. Most of
 it was caused by a bug in an old version of libc. But you do have to
 ensure that the other threads don't call any Postgres functions
 because the Postgres code is very much not thread-aware.

 --
 greg



I'm a novice in PG, and if I get this running I'll have achieved what I
wanted without digging further into PG. So I was wondering whether
there is a configuration option (or something similar) that would
eliminate (or reduce) the role of the background writer process. I
believe it might be one of the WAL options, but I'm not sure. Any
suggestions?
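For what it's worth (an aside, not from the thread): the background
writer's LRU scanning can be throttled from postgresql.conf, though
checkpoints will still write dirty buffers regardless. The GUC names
below are real; the values are only illustrative:

```
# postgresql.conf -- throttle the background writer (illustrative values)
bgwriter_delay = 10000ms        # upper limit; sleep this long between rounds
bgwriter_lru_maxpages = 0       # 0 disables LRU-driven background writes
```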


Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-16 Thread Tareq Aljabban
On Thu, Mar 15, 2012 at 7:58 PM, Jeff Davis pg...@j-davis.com wrote:

 On Thu, 2012-03-15 at 19:36 -0400, Tareq Aljabban wrote:
  When configuring postgreSQL, I'm adding the libraries needed to run
  HDFS C API (libhdfs).
 

 From the information below, it looks like C++.

 
   ./configure --prefix=/diskless/myUser/Workspace/EclipseWS1/pgsql
  --enable-depend --enable-cassert --enable-debug CFLAGS=$CFLAGS
  -I/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs
  -I/usr/lib/jvm/default-java/include LDFLAGS=$LDFLAGS
  -L/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs
  -L/diskless/myUser/Workspace/HDFS_Append/build/c++/Linux-i386-32/lib
  -L/usr/lib/jvm/default-java/jre/lib/i386/server -ljvm -lhdfs
 
 
 
 
 
  I have done lots of changes so far on how the storage manager works.
  In fact, I changed smgr.c so instead of calling regular md.c
  functions, that it would call my functions .
  For simplicity, you can say that whenever mdwrite() was supposed to be
  called, another function is also called beside it. so now what is
  being called is:
  mdwrite(param1, param2, buffer, ...);
  hdfswrite(param1, param2, buffer, ...);
 
 
  where hdfswrite() is the function where I write the buffer to HDFS.
  I changed hdfswrite() so it will always write only the same (dummy)
  buffer down to HDFS storage. Let's say dummyBufferFile. After
  writing this file 3 times to HDFS, I'm getting the message that I've
  shown in my first email.
  The same hdfswrite() code works without any issues when I run it in a
  separate application.
 
 
  Hope it's clear now.

 Well, it's clear that there's a lot going on ;)

 In other words, is there a reason you think that these would all work
 together nicely?

 I don't know the specifics, but I understand there are numerous perils
 to linking complex C++ code into complex C code, particularly around
 exception handling. Look in the archives for previous discussions around
 C++.


Thanks for your response, Jeff.
It's true that the libhdfs code resides under the c++ folder, but in
all of the documentation libhdfs is referred to as the C interface of
HDFS.
What you're saying makes sense: nothing guarantees this will work well.
But during the initdb phase the function calls complete cleanly,
without any exceptions; when starting the postmaster, something goes
wrong after 3 calls to libhdfs. That's what I'm confused about.
What is the difference between the two processes (initdb, starting the
postmaster) that might cause this crash to happen?

Regards





Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-16 Thread Jeff Davis
On Fri, 2012-03-16 at 18:02 -0400, Tareq Aljabban wrote:
 Thanks for your response Jeff.. 
 It's true that libhdfs code resides under the c++ folder, but in all
 of the documentation, libhdfs is referred to as the C interface of
 HDFS.
 Now what you're saying makes sense.. that nothing guarantees this will
 work well.. but in the phase of initDB, the function calls are done
 nicely without any exceptions.. when starting the postmaster,
 something wrong happens after 3 calls to libhdfs.. that's what I'm
 confused about..
 What is the difference between the two processes (initDB, start
 postmaster) that might cause this crash to happen?

There is a lot of difference between those two. In particular, it looks
like the problem you are seeing is coming from the background writer,
which is not running during initdb.

I think the next step would be for you to compile in debug mode with
-O0, and attach to the bgwriter process with gdb, and step through it.
Then, you can see the exact path which causes the bgwriter to exit, and
that will give you a better idea where the problem is.
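Concretely (an illustrative session, not from the thread; the pg_ctl
arguments and the grep pattern are assumptions), that debugging
workflow might look like:

```
$ ./configure --enable-debug CFLAGS="-O0 -g" ...
$ make && make install
$ bin/pg_ctl -D $PGDATA -l logfile start
$ ps aux | grep 'writer process'          # note the bgwriter's pid
$ gdb -p <bgwriter-pid>
(gdb) break mdwrite
(gdb) continue
```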

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-16 Thread Greg Stark
On Fri, Mar 16, 2012 at 11:29 PM, Jeff Davis pg...@j-davis.com wrote:
 There is a lot of difference between those two. In particular, it looks
 like the problem you are seeing is coming from the background writer,
 which is not running during initdb.

The difference that comes to mind is that the postmaster forks. If the
library opens any connections prior to forking and then uses them
after forking that would work at first but it would get confused
quickly once more than one backend tries to use the same connection.
The data being sent would all be mixed together and they would see
responses to requests other processes sent.

You need to ensure that any network connections are opened up *after*
the new processes are forked.
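One way to enforce that (a minimal sketch, not from the thread; the
names and the connect stub are hypothetical, standing in for
hdfsConnect() and friends) is to tag the cached connection with the
pid that created it, and lazily reconnect whenever the current pid no
longer matches:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle; stands in for libhdfs's hdfsFS. */
typedef struct hdfs_conn { pid_t owner_pid; } hdfs_conn;

static hdfs_conn *cached_conn = NULL;

/* Stub for the real connect call (e.g. hdfsConnect("default", 0)). */
static hdfs_conn *hdfs_connect_stub(void)
{
    hdfs_conn *c = malloc(sizeof(hdfs_conn));
    c->owner_pid = getpid();
    return c;
}

/* Return a connection owned by the *current* process.  A connection
 * inherited across fork() carries a stale owner_pid, so it is
 * discarded (a real version would also close it) and a fresh one is
 * opened in this process. */
static hdfs_conn *get_hdfs_conn(void)
{
    if (cached_conn == NULL || cached_conn->owner_pid != getpid())
        cached_conn = hdfs_connect_stub();
    return cached_conn;
}
```

Each backend then calls get_hdfs_conn() instead of reusing a
connection the postmaster opened before the fork.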

There are other things going on that could cause problems. initdb
probably doesn't deal with many errors so it might not be causing any
longjmps that happen when Postgres handles errors. I suspect it
doesn't create any temporary files, etc.

There's also a long history of threads not playing well with Postgres.
I think that's overblown and I believe they should work fine. Most of
it was caused by a bug in an old version of libc. But you do have to
ensure that the other threads don't call any Postgres functions
because the Postgres code is very much not thread-aware.

-- 
greg



[HACKERS] Storage Manager crash at mdwrite()

2012-03-15 Thread Tareq Aljabban
I'm implementing an extension to mdwrite() in backend/storage/smgr/md.c.
When a block is written to local storage using mdwrite(), I send the
same block to an HDFS storage.
So far I don't need to read back the values I'm writing to HDFS, and
this approach works fine during the initdb phase.
However, when I run postgres (bin/pg_ctl start), the first few write
operations run successfully, and then suddenly (after writing exactly 3
files to HDFS) I get a 130 exit code, with the following message and
the JVM thread dump of HDFS:

LOG:  background writer process (PID 29347) exited with exit code 130
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
2012-03-15 13:27:52
Full thread dump OpenJDK Server VM (16.0-b13 mixed mode):

IPC Client (47) connection to localhost/127.0.0.1:8020 from taljab1
daemon prio=10 tid=0x8994d400 nid=0x72e4 in Object.wait() [0x898ad000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
 - waiting on 0x8a3c3050 (a org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:403)
 - locked 0x8a3c3050 (a org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)

IPC Client (47) connection to localhost/127.0.0.1:8020 from taljab1
daemon prio=10 tid=0x89b87c00 nid=0x72e3 in Object.wait() [0x898fe000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
 - waiting on 0x8a2ff268 (a org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:403)
 - locked 0x8a2ff268 (a org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)

Low Memory Detector daemon prio=10 tid=0x09daa400 nid=0x72cd runnable
[0x]
   java.lang.Thread.State: RUNNABLE

CompilerThread1 daemon prio=10 tid=0x09da8400 nid=0x72cc waiting on
condition [0x]
   java.lang.Thread.State: RUNNABLE

CompilerThread0 daemon prio=10 tid=0x09da6000 nid=0x72cb waiting on
condition [0x]
   java.lang.Thread.State: RUNNABLE

Signal Dispatcher daemon prio=10 tid=0x09da4800 nid=0x72ca waiting on
condition [0x]
   java.lang.Thread.State: RUNNABLE

Finalizer daemon prio=10 tid=0x09d94800 nid=0x72c9 in Object.wait()
[0x89db4000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
 - waiting on 0x8a7202b0 (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
 - locked 0x8a7202b0 (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
 at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

Reference Handler daemon prio=10 tid=0x09d8fc00 nid=0x72c8 in
Object.wait() [0x89e05000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
 - waiting on 0x8a720338 (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
 at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
- locked 0x8a720338 (a java.lang.ref.Reference$Lock)

main prio=10 tid=0x09d3bc00 nid=0x72c6 runnable [0x]
   java.lang.Thread.State: RUNNABLE

VM Thread prio=10 tid=0x09d8d000 nid=0x72c7 runnable

VM Periodic Task Thread prio=10 tid=0x09dac800 nid=0x72ce waiting on
condition

JNI global references: 763

Heap
 def new generation   total 4800K, used 1844K [0x8a27, 0x8a7a,
0x9487)
  eden space 4288K,  34% used [0x8a27, 0x8a3e0418, 0x8a6a)
  from space 512K,  72% used [0x8a72, 0x8a77cdd8, 0x8a7a)
  to   space 512K,   0% used [0x8a6a, 0x8a6a, 0x8a72)
 tenured generation   total 10624K, used 0K [0x9487, 0x952d,
0xa947)
   the space 10624K,   0% used [0x9487, 0x9487, 0x94870200,
0x952d)
 compacting perm gen  total 16384K, used 5765K [0xa947, 0xaa47,
0xb147)
   the space 16384K,  35% used [0xa947, 0xa9a11480, 0xa9a11600,
0xaa47)
No shared spaces configured.

This seems like an HDFS issue, but the same code worked properly in
initdb. I replaced the HDFS write code with code that always writes the
same (empty) block to HDFS, regardless of the value received by
mdwrite(), and kept getting the same issue after writing 3 files.
I copied this exact code into a separate C application and ran it
there, and it executed without any problems (I wrote/deleted 100
files). That's why I suspect it's something related to postgreSQL.

Any ideas on what's going wrong?


Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-15 Thread Jeff Davis
On Thu, 2012-03-15 at 13:49 -0400, Tareq Aljabban wrote:
 I'm implementing an extension to mdwrite() at
 backend/storage/smgr/md.c
 When a block is written to the local storage using mdwrite(), I'm
 sending this block to an HDFS storage. 
 So far I don't need to read back the values I'm writing to HDFS. This
 approach is working fine in the initDB phase.
 However, when I'm running postgres (bin/pg_ctl start), the first few
 write operations run successfully, and then suddenly (after writing
 exactly 3 files to HDFS), I get a 130 exit code with the following
 message showing the JVM thread dump of HDFS:
 
 
 LOG:  background writer process (PID 29347) exited with exit code 130
 LOG:  terminating any other active server processes

...

 This seems like an HDFS issue, but the same code worked properly in
 initDB(). I replaced this HDFS write code with a code that writes
 always the same block (empty one) to HDFS regardless of the value
 received by mdwrite().. Kept getting the same issue after writing 3
 files.
 I copied this exact code to a separate C application and ran it there
 and it executed without any problems (I wrote/deleted 100 files).
 That's why I'm doubting that it's something related to postgreSQL.
 
 
 Any ideas on what's going wrong?

What code, exactly, did you change in md.c, and anywhere else in
postgres? Are you linking in new libraries/code from somewhere into the
postgres backend?

Regards,
Jeff Davis




Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-15 Thread Tareq Aljabban
When configuring postgreSQL, I add the libraries needed to use the HDFS
C API (libhdfs).


 ./configure --prefix=/diskless/myUser/Workspace/EclipseWS1/pgsql \
   --enable-depend --enable-cassert --enable-debug \
   CFLAGS="$CFLAGS -I/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -I/usr/lib/jvm/default-java/include" \
   LDFLAGS="$LDFLAGS -L/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -L/diskless/myUser/Workspace/HDFS_Append/build/c++/Linux-i386-32/lib -L/usr/lib/jvm/default-java/jre/lib/i386/server -ljvm -lhdfs"


I have made a lot of changes so far to how the storage manager works.
In fact, I changed smgr.c so that instead of calling the regular md.c
functions, it calls my functions.
For simplicity, you can say that whenever mdwrite() was supposed to be
called, another function is now called alongside it, so what is being
called is:
mdwrite(param1, param2, buffer, ...);
hdfswrite(param1, param2, buffer, ...);

where hdfswrite() is the function in which I write the buffer to HDFS.
I changed hdfswrite() so it always writes the same (dummy) buffer down
to HDFS storage; let's call it dummyBufferFile. After writing this
file 3 times to HDFS, I get the message shown in my first email.
The same hdfswrite() code works without any issues when I run it in a
separate application.
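For illustration, a minimal sketch of what such an hdfswrite() might
look like against the libhdfs C API (hdfsConnect, hdfsOpenFile,
hdfsWrite, and hdfsCloseFile are the real entry points; the signature,
path handling, and error handling here are simplified assumptions, and
running it requires a reachable HDFS namenode):

```c
#include <fcntl.h>
#include "hdfs.h"   /* libhdfs C API header */

/* Hypothetical mirror of mdwrite(): push one block to HDFS.
 * Connecting per call is shown only for simplicity; in a forked
 * server the connection should be cached per process instead. */
static void
hdfswrite(const char *path, const char *buffer, int len)
{
    hdfsFS fs = hdfsConnect("default", 0);   /* use fs.default.name */
    if (fs == NULL)
        return;                              /* real code: report the error */

    hdfsFile f = hdfsOpenFile(fs, path, O_WRONLY, 0, 0, 0);
    if (f != NULL)
    {
        hdfsWrite(fs, f, buffer, len);
        hdfsCloseFile(fs, f);
    }
    hdfsDisconnect(fs);
}
```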

Hope it's clear now.







On Thu, Mar 15, 2012 at 5:28 PM, Jeff Davis pg...@j-davis.com wrote:

 On Thu, 2012-03-15 at 13:49 -0400, Tareq Aljabban wrote:
  I'm implementing an extension to mdwrite() at
  backend/storage/smgr/md.c
  When a block is written to the local storage using mdwrite(), I'm
  sending this block to an HDFS storage.
  So far I don't need to read back the values I'm writing to HDFS. This
  approach is working fine in the initDB phase.
  However, when I'm running postgres (bin/pg_ctl start), the first few
  write operations run successfully, and then suddenly (after writing
  exactly 3 files to HDFS), I get a 130 exit code with the following
  message showing the JVM thread dump of HDFS:
 
 
  LOG:  background writer process (PID 29347) exited with exit code 130
  LOG:  terminating any other active server processes

 ...

  This seems like an HDFS issue, but the same code worked properly in
  initDB(). I replaced this HDFS write code with a code that writes
  always the same block (empty one) to HDFS regardless of the value
  received by mdwrite().. Kept getting the same issue after writing 3
  files.
  I copied this exact code to a separate C application and ran it there
  and it executed without any problems (I wrote/deleted 100 files).
  That's why I'm doubting that it's something related to postgreSQL.
 
 
  Any ideas on what's going wrong?

 What code, exactly, did you change in md.c, and anywhere else in
 postgres? Are you linking in new libraries/code from somewhere into the
 postgres backend?

 Regards,
 Jeff Davis




Re: [HACKERS] Storage Manager crash at mdwrite()

2012-03-15 Thread Jeff Davis
On Thu, 2012-03-15 at 19:36 -0400, Tareq Aljabban wrote:
 When configuring postgreSQL, I'm adding the libraries needed to run
 HDFS C API (libhdfs).
 

From the information below, it looks like C++.

 
  ./configure --prefix=/diskless/myUser/Workspace/EclipseWS1/pgsql
 --enable-depend --enable-cassert --enable-debug CFLAGS=$CFLAGS
 -I/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs
 -I/usr/lib/jvm/default-java/include LDFLAGS=$LDFLAGS
 -L/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs
 -L/diskless/myUser/Workspace/HDFS_Append/build/c++/Linux-i386-32/lib
 -L/usr/lib/jvm/default-java/jre/lib/i386/server -ljvm -lhdfs
 
 
 
 
 
 I have done lots of changes so far on how the storage manager works.
 In fact, I changed smgr.c so instead of calling regular md.c
 functions, that it would call my functions .
 For simplicity, you can say that whenever mdwrite() was supposed to be
 called, another function is also called beside it. so now what is
 being called is:
 mdwrite(param1, param2, buffer, ...);
 hdfswrite(param1, param2, buffer, ...);
 
 
 where hdfswrite() is the function where I write the buffer to HDFS.
 I changed hdfswrite() so it will always write only the same (dummy)
 buffer down to HDFS storage. Let's say dummyBufferFile. After
 writing this file 3 times to HDFS, I'm getting the message that I've
 shown in my first email.
 The same hdfswrite() code works without any issues when I run it in a
 separate application.
 
 
 Hope it's clear now.

Well, it's clear that there's a lot going on ;)

In other words, is there a reason you think that these would all work
together nicely?

I don't know the specifics, but I understand there are numerous perils
to linking complex C++ code into complex C code, particularly around
exception handling. Look in the archives for previous discussions around
C++.

Regards,
Jeff Davis


