Re: [HACKERS] Storage Manager crash at mdwrite()
On Fri, Mar 16, 2012 at 8:34 PM, Greg Stark st...@mit.edu wrote:
> On Fri, Mar 16, 2012 at 11:29 PM, Jeff Davis pg...@j-davis.com wrote:
>> There is a lot of difference between those two. In particular, it looks like the problem you are seeing is coming from the background writer, which is not running during initdb.
>
> The difference that comes to mind is that the postmaster forks. If the library opens any connections prior to forking and then uses them after forking, that would work at first, but it would quickly get confused once more than one backend tries to use the same connection. The data being sent would all be mixed together, and each process would see responses to requests that other processes sent. You need to ensure that any network connections are opened up *after* the new processes are forked.

It's true: it turned out that the cause of the problem is that HDFS has trouble dealing with forked processes. However, there's no clear suggestion on how to fix this.

I attached gdb to the writer process and got the following backtrace:

#0  0xb76f0430 in __kernel_vsyscall ()
#1  0xb6b2893d in ___newselect_nocancel () at ../sysdeps/unix/syscall-template.S:82
#2  0x0840ab46 in pg_usleep (microsec=20) at pgsleep.c:43
#3  0x0829ca9a in BgWriterNap () at bgwriter.c:642
#4  0x0829c882 in BackgroundWriterMain () at bgwriter.c:540
#5  0x0811b0ec in AuxiliaryProcessMain (argc=2, argv=0xbf982308) at bootstrap.c:417
#6  0x082a9af1 in StartChildProcess (type=BgWriterProcess) at postmaster.c:4427
#7  0x082a75de in reaper (postgres_signal_arg=17) at postmaster.c:2390
#8  <signal handler called>
#9  0xb76f0430 in __kernel_vsyscall ()
#10 0xb6b2893d in ___newselect_nocancel () at ../sysdeps/unix/syscall-template.S:82
#11 0x082a5b62 in ServerLoop () at postmaster.c:1391
#12 0x082a53e2 in PostmasterMain (argc=3, argv=0xa525c28) at postmaster.c:1092
#13 0x0822dfa8 in main (argc=3, argv=0xa525c28) at main.c:188

Any ideas?
Re: [HACKERS] Storage Manager crash at mdwrite()
On Fri, Mar 16, 2012 at 8:34 PM, Greg Stark st...@mit.edu wrote:
> On Fri, Mar 16, 2012 at 11:29 PM, Jeff Davis pg...@j-davis.com wrote:
>> There is a lot of difference between those two. In particular, it looks like the problem you are seeing is coming from the background writer, which is not running during initdb.
>
> The difference that comes to mind is that the postmaster forks. If the library opens any connections prior to forking and then uses them after forking, that would work at first, but it would quickly get confused once more than one backend tries to use the same connection. The data being sent would all be mixed together, and each process would see responses to requests that other processes sent.

The opened-connections issue was the first thing I thought of. The HDFS documentation claims the C interface is thread-safe. However, since I noticed that different processes (with different process IDs) are calling mdwrite(), it might well be the cause. I tried reconnecting to HDFS on every mdwrite() request, but it didn't work out.

> You need to ensure that any network connections are opened up *after* the new processes are forked.
>
> There are other things going on that could cause problems. initdb probably doesn't deal with many errors, so it might not be causing any longjmps that happen when Postgres handles errors. I suspect it doesn't create any temporary files, etc. There's also a long history of threads not playing well with Postgres. I think that's overblown and I believe they should work fine; most of it was caused by a bug in an old version of libc. But you do have to ensure that the other threads don't call any Postgres functions, because the Postgres code is very much not thread-aware.
>
> -- greg

I'm a novice in PG, and if I get this one running I'll have achieved what I wanted without digging further into PG. So I was wondering whether there is a configuration option (or something similar) that would eliminate (or reduce) the role of the background writer process.
I believe it may be one of the WAL options, but I'm not sure. Any suggestions?
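For what it's worth, the background writer is governed by its own settings rather than by the WAL options. A hedged sketch of the postgresql.conf knobs that throttle it (parameter names as in PostgreSQL of that era; values purely illustrative, and this reduces the bgwriter's activity rather than eliminating the process):

```
# postgresql.conf: throttle the background writer
bgwriter_delay = 10s          # longest allowed sleep between activity rounds
bgwriter_lru_maxpages = 0     # 0 disables LRU-driven background writes
```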
Re: [HACKERS] Storage Manager crash at mdwrite()
On Thu, Mar 15, 2012 at 7:58 PM, Jeff Davis pg...@j-davis.com wrote:
> On Thu, 2012-03-15 at 19:36 -0400, Tareq Aljabban wrote:
>> When configuring PostgreSQL, I'm adding the libraries needed to run the HDFS C API (libhdfs).
>
> From the information below, it looks like C++.
>
>> ./configure --prefix=/diskless/myUser/Workspace/EclipseWS1/pgsql --enable-depend --enable-cassert --enable-debug CFLAGS="$CFLAGS -I/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -I/usr/lib/jvm/default-java/include" LDFLAGS="$LDFLAGS -L/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -L/diskless/myUser/Workspace/HDFS_Append/build/c++/Linux-i386-32/lib -L/usr/lib/jvm/default-java/jre/lib/i386/server -ljvm -lhdfs"
>>
>> I have made a lot of changes so far to how the storage manager works. In fact, I changed smgr.c so that instead of calling the regular md.c functions, it calls my functions. For simplicity, you can say that whenever mdwrite() was supposed to be called, another function is also called beside it, so now what is called is:
>>   mdwrite(param1, param2, buffer, ...);
>>   hdfswrite(param1, param2, buffer, ...);
>> where hdfswrite() is the function in which I write the buffer to HDFS. I changed hdfswrite() so it always writes the same (dummy) buffer to HDFS storage; let's say dummyBufferFile. After writing this file 3 times to HDFS, I get the message that I showed in my first email. The same hdfswrite() code works without any issues when I run it in a separate application. Hope it's clear now.
>
> Well, it's clear that there's a lot going on ;)
>
> In other words, is there a reason you think that these would all work together nicely? I don't know the specifics, but I understand there are numerous perils to linking complex C++ code into complex C code, particularly around exception handling. Look in the archives for previous discussions around C++.

Thanks for your response, Jeff.
It's true that the libhdfs code resides under the c++ folder, but in all of the documentation libhdfs is referred to as the C interface of HDFS. What you're saying makes sense: nothing guarantees this will work well. But in the initdb phase the function calls complete without any exceptions, while after starting the postmaster something goes wrong after 3 calls to libhdfs. That's what I'm confused about. What is the difference between the two processes (initdb, starting the postmaster) that might cause this crash?

Regards
Re: [HACKERS] Storage Manager crash at mdwrite()
On Fri, 2012-03-16 at 18:02 -0400, Tareq Aljabban wrote:
> Thanks for your response, Jeff. It's true that the libhdfs code resides under the c++ folder, but in all of the documentation libhdfs is referred to as the C interface of HDFS. What you're saying makes sense: nothing guarantees this will work well. But in the initdb phase the function calls complete without any exceptions, while after starting the postmaster something goes wrong after 3 calls to libhdfs. That's what I'm confused about. What is the difference between the two processes (initdb, starting the postmaster) that might cause this crash?

There is a lot of difference between those two. In particular, it looks like the problem you are seeing is coming from the background writer, which is not running during initdb.

I think the next step would be for you to compile in debug mode with -O0, attach to the bgwriter process with gdb, and step through it. Then you can see the exact path that causes the bgwriter to exit, and that will give you a better idea of where the problem is.

Regards,
Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Storage Manager crash at mdwrite()
On Fri, Mar 16, 2012 at 11:29 PM, Jeff Davis pg...@j-davis.com wrote:
> There is a lot of difference between those two. In particular, it looks like the problem you are seeing is coming from the background writer, which is not running during initdb.

The difference that comes to mind is that the postmaster forks. If the library opens any connections prior to forking and then uses them after forking, that would work at first, but it would quickly get confused once more than one backend tries to use the same connection. The data being sent would all be mixed together, and each process would see responses to requests that other processes sent. You need to ensure that any network connections are opened up *after* the new processes are forked.

There are other things going on that could cause problems. initdb probably doesn't deal with many errors, so it might not be causing any longjmps that happen when Postgres handles errors. I suspect it doesn't create any temporary files, etc.

There's also a long history of threads not playing well with Postgres. I think that's overblown and I believe they should work fine; most of it was caused by a bug in an old version of libc. But you do have to ensure that the other threads don't call any Postgres functions, because the Postgres code is very much not thread-aware.

-- greg
Re: [HACKERS] Storage Manager crash at mdwrite()
On Thu, 2012-03-15 at 13:49 -0400, Tareq Aljabban wrote:
> I'm implementing an extension to mdwrite() at backend/storage/smgr/md.c. When a block is written to local storage using mdwrite(), I'm sending this block to HDFS storage as well. So far I don't need to read back the values I'm writing to HDFS. This approach works fine in the initdb phase. However, when I'm running postgres (bin/pg_ctl start), the first few write operations run successfully, and then suddenly (after writing exactly 3 files to HDFS) I get a 130 exit code, with the following message showing the JVM thread dump of HDFS:
>
> LOG: background writer process (PID 29347) exited with exit code 130
> LOG: terminating any other active server processes
> ...
>
> This seems like an HDFS issue, but the same code worked properly in initdb. I replaced this HDFS write code with code that always writes the same (empty) block to HDFS, regardless of the value received by mdwrite(), and kept getting the same issue after writing 3 files. I copied this exact code to a separate C application and ran it there, and it executed without any problems (I wrote/deleted 100 files). That's why I suspect the problem is related to PostgreSQL. Any ideas on what's going wrong?

What code, exactly, did you change in md.c, and anywhere else in Postgres? Are you linking in new libraries/code from somewhere into the Postgres backend?

Regards,
Jeff Davis
Re: [HACKERS] Storage Manager crash at mdwrite()
When configuring PostgreSQL, I'm adding the libraries needed to run the HDFS C API (libhdfs):

./configure --prefix=/diskless/myUser/Workspace/EclipseWS1/pgsql --enable-depend --enable-cassert --enable-debug CFLAGS="$CFLAGS -I/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -I/usr/lib/jvm/default-java/include" LDFLAGS="$LDFLAGS -L/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -L/diskless/myUser/Workspace/HDFS_Append/build/c++/Linux-i386-32/lib -L/usr/lib/jvm/default-java/jre/lib/i386/server -ljvm -lhdfs"

I have made a lot of changes so far to how the storage manager works. In fact, I changed smgr.c so that instead of calling the regular md.c functions, it calls my functions. For simplicity, you can say that whenever mdwrite() was supposed to be called, another function is also called beside it, so now what is called is:

  mdwrite(param1, param2, buffer, ...);
  hdfswrite(param1, param2, buffer, ...);

where hdfswrite() is the function in which I write the buffer to HDFS. I changed hdfswrite() so it always writes the same (dummy) buffer to HDFS storage; let's say dummyBufferFile. After writing this file 3 times to HDFS, I get the message that I showed in my first email. The same hdfswrite() code works without any issues when I run it in a separate application. Hope it's clear now.

On Thu, Mar 15, 2012 at 5:28 PM, Jeff Davis pg...@j-davis.com wrote:
> On Thu, 2012-03-15 at 13:49 -0400, Tareq Aljabban wrote:
>> I'm implementing an extension to mdwrite() at backend/storage/smgr/md.c. When a block is written to local storage using mdwrite(), I'm sending this block to HDFS storage as well. So far I don't need to read back the values I'm writing to HDFS. This approach works fine in the initdb phase. However, when I'm running postgres (bin/pg_ctl start), the first few write operations run successfully, and then suddenly (after writing exactly 3 files to HDFS) I get a 130 exit code, with the following message showing the JVM thread dump of HDFS:
>>
>> LOG: background writer process (PID 29347) exited with exit code 130
>> LOG: terminating any other active server processes
>> ...
>>
>> This seems like an HDFS issue, but the same code worked properly in initdb. I replaced this HDFS write code with code that always writes the same (empty) block to HDFS, regardless of the value received by mdwrite(), and kept getting the same issue after writing 3 files. I copied this exact code to a separate C application and ran it there, and it executed without any problems (I wrote/deleted 100 files). That's why I suspect the problem is related to PostgreSQL. Any ideas on what's going wrong?
>
> What code, exactly, did you change in md.c, and anywhere else in Postgres? Are you linking in new libraries/code from somewhere into the Postgres backend?
>
> Regards,
> Jeff Davis
Re: [HACKERS] Storage Manager crash at mdwrite()
On Thu, 2012-03-15 at 19:36 -0400, Tareq Aljabban wrote:
> When configuring PostgreSQL, I'm adding the libraries needed to run the HDFS C API (libhdfs).

From the information below, it looks like C++.

> ./configure --prefix=/diskless/myUser/Workspace/EclipseWS1/pgsql --enable-depend --enable-cassert --enable-debug CFLAGS="$CFLAGS -I/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -I/usr/lib/jvm/default-java/include" LDFLAGS="$LDFLAGS -L/diskless/myUser/Workspace/HDFS_Append/hdfs/src/c++/libhdfs -L/diskless/myUser/Workspace/HDFS_Append/build/c++/Linux-i386-32/lib -L/usr/lib/jvm/default-java/jre/lib/i386/server -ljvm -lhdfs"
>
> I have made a lot of changes so far to how the storage manager works. In fact, I changed smgr.c so that instead of calling the regular md.c functions, it calls my functions. For simplicity, you can say that whenever mdwrite() was supposed to be called, another function is also called beside it, so now what is called is:
>   mdwrite(param1, param2, buffer, ...);
>   hdfswrite(param1, param2, buffer, ...);
> where hdfswrite() is the function in which I write the buffer to HDFS. I changed hdfswrite() so it always writes the same (dummy) buffer to HDFS storage; let's say dummyBufferFile. After writing this file 3 times to HDFS, I get the message that I showed in my first email. The same hdfswrite() code works without any issues when I run it in a separate application. Hope it's clear now.

Well, it's clear that there's a lot going on ;)

In other words, is there a reason you think that these would all work together nicely? I don't know the specifics, but I understand there are numerous perils to linking complex C++ code into complex C code, particularly around exception handling. Look in the archives for previous discussions around C++.

Regards,
Jeff Davis