On Fri, Mar 16, 2012 at 8:34 PM, Greg Stark <st...@mit.edu> wrote:

> On Fri, Mar 16, 2012 at 11:29 PM, Jeff Davis <pg...@j-davis.com> wrote:
> > There is a lot of difference between those two. In particular, it looks
> > like the problem you are seeing is coming from the background writer,
> > which is not running during initdb.
> The difference that comes to mind is that the postmaster forks. If the
> library opens any connections prior to forking and then uses them
> after forking that would work at first but it would get confused
> quickly once more than one backend tries to use the same connection.
> The data being sent would all be mixed together and they would see
> responses to requests other processes sent.
> You need to ensure that any network connections are opened up *after*
> the new processes are forked.

That's true. It turned out that the cause of the problem is that HDFS has
trouble dealing with forked processes. However, there is no clear
suggestion on how to fix this.
I attached gdb to the writer process and got the following backtrace:

#0  0xb76f0430 in __kernel_vsyscall ()
#1  0xb6b2893d in ___newselect_nocancel () at
#2  0x0840ab46 in pg_usleep (microsec=200000) at pgsleep.c:43
#3  0x0829ca9a in BgWriterNap () at bgwriter.c:642
#4  0x0829c882 in BackgroundWriterMain () at bgwriter.c:540
#5  0x0811b0ec in AuxiliaryProcessMain (argc=2, argv=0xbf982308) at
#6  0x082a9af1 in StartChildProcess (type=BgWriterProcess) at
#7  0x082a75de in reaper (postgres_signal_arg=17) at postmaster.c:2390
#8  <signal handler called>
#9  0xb76f0430 in __kernel_vsyscall ()
#10 0xb6b2893d in ___newselect_nocancel () at
#11 0x082a5b62 in ServerLoop () at postmaster.c:1391
#12 0x082a53e2 in PostmasterMain (argc=3, argv=0xa525c28) at
#13 0x0822dfa8 in main (argc=3, argv=0xa525c28) at main.c:188

Any ideas?
