Clement Law created COUCHDB-2693:
------------------------------------
Summary: Continual server termination on disparate documents
Key: COUCHDB-2693
URL: https://issues.apache.org/jira/browse/COUCHDB-2693
Project: CouchDB
Issue Type: Bug
Security Level: public (Regular issues)
Components: Database Core
Reporter: Clement Law
Hello,
I have a 2.9 TB CouchDB database that I'm trying to compact.
As soon as the compaction begins, I get the following error (with the verbose
flag turned on):
{code}
** Generic server <0.190.0> terminating
** Last message in was {'EXIT',<0.195.0>,
{function_clause,
[{couch_compress,decompress,
[<<>>],
[{file,"couch_compress.erl"},{line,67}]},
{couch_file,pread_term,2,
[{file,"couch_file.erl"},{line,135}]},
{couch_btree,get_node,2,
[{file,"couch_btree.erl"},{line,349}]},
{couch_btree,stream_node,7,
[{file,"couch_btree.erl"},{line,623}]},
{couch_btree,stream_kp_node,7,
[{file,"couch_btree.erl"},{line,637}]},
{couch_btree,stream_kp_node,8,
[{file,"couch_btree.erl"},{line,679}]},
{couch_btree,fold,4,
[{file,"couch_btree.erl"},{line,159}]},
{couch_db_updater,copy_compact,3,
[{file,"couch_db_updater.erl"},
{line,957}]}]}}
** When Server state == {db,<0.189.0>,<0.190.0>,<0.195.0>,
<<"1431530078711688">>,<0.191.0>,<0.187.0>,
<0.193.0>,
{db_header,6,2680278230,0,
{3011448227182,
{218864945,184981357,853506954536},
119310623816},
{3011448230903,403846302,36117350534},
{2127482409329,[],20950},
0,nil,nil,1000},
2680278230,
{btree,<0.187.0>,
{3011448354457,
{218864947,184981357,853506955531},
119310624011},
#Fun<couch_db_updater.10.58444962>,
#Fun<couch_db_updater.11.58444962>,
#Fun<couch_btree.5.15886126>,
#Fun<couch_db_updater.12.58444962>,snappy},
{btree,<0.187.0>,
{3011448358335,403846304,36117350691},
#Fun<couch_db_updater.13.58444962>,
#Fun<couch_db_updater.14.58444962>,
#Fun<couch_btree.5.15886126>,
#Fun<couch_db_updater.15.58444962>,snappy},
{btree,<0.187.0>,
{2127482409329,[],20950},
#Fun<couch_btree.3.15886126>,
#Fun<couch_btree.4.15886126>,
#Fun<couch_btree.5.15886126>,nil,snappy},
2680278232,<<"ddb">>,"/var/lib/couchdb/ddb.couch",
[],[],nil,
{user_ctx,null,[],undefined},
#Ref<0.0.0.684>,1000,
[before_header,after_header,on_file_open],
[{user_ctx,
{user_ctx,null,[<<"_admin">>],undefined}}],
snappy,nil,nil}
{code}
I have found that requesting specific documents seems to cause the same error
to occur.
What I'd like to do is remove these offending documents, but I am not even able
to request a deletion via the API. The same error occurs.
What is weird is that I am able to write to the database. I am even able to
request the changes list, i.e., _changes?include_docs=true&since=<seq number>
which occasionally dies out at specific documents.
I have been using the following script to slowly produce a hopefully valid
version of the database:
{code:bash}
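#!/bin/bash
END=2684050877
SEQ=0
LASTSEQ=''
LIMIT=10000000
while (( SEQ < END )); do
    # Fetch the next batch of changes, starting after sequence number $SEQ.
    curl -s "http://uk-couch6.yellow.sophos:5984/ddb_wz/_changes?include_docs=true&since=$SEQ&limit=$LIMIT" \
        2>/dev/null > couch6.dmp
    # The last "seq" in the dump is the last change that was read successfully.
    SEQ=$(grep -o '"seq":[0-9]*' couch6.dmp | cut -d: -f2 | tail -n1)
    echo "SEQ=$SEQ"
    if [ "$SEQ" != "" ]; then
        # Append this batch to the cumulative compressed dump.
        gzip < couch6.dmp >> couch6.dmp.gz
        if (( SEQ > LASTSEQ )); then
            LASTSEQ=$SEQ
        elif [ "$LASTSEQ" == "$SEQ" ]; then
            # No progress since the last batch: step past this sequence number.
            (( SEQ = SEQ + 1 ))
        elif (( SEQ < LASTSEQ )); then
            echo "WTF??? $SEQ < $LASTSEQ "
        fi
    fi
done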
{code}
I restart after every server error, incrementing the sequence number by one
each time.
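The skip-ahead step can be sketched as two small helpers that work on a dump file alone, so they can be tried without a running server. This is only an illustration: the function names last_seq and next_since are hypothetical and do not appear in the script above, and the sketch assumes that a request with since=N returns only changes after N, so that using N+1 steps past the change that triggered the error.

```shell
#!/bin/bash
# Hypothetical helpers (names are assumptions, not part of the original script).

# Print the last "seq" value found in a _changes dump file,
# or print nothing if the file contains no "seq" entries.
last_seq() {
    grep -o '"seq":[0-9]*' "$1" | cut -d: -f2 | tail -n1
}

# Decide the since value for the next request.
#   $1 = since value used for the request that produced the dump
#   $2 = path to the dump file
# If the dump made progress, continue from its last seq.
# Otherwise assume the next change is unreadable and step past it.
next_since() {
    local since=$1 seq
    seq=$(last_seq "$2")
    if [ -n "$seq" ] && (( seq > since )); then
        echo "$seq"
    else
        echo $(( since + 1 ))
    fi
}
```

For example, after a request with since=$SEQ has written couch6.dmp, SEQ=$(next_since "$SEQ" couch6.dmp) would give the value to use once the server is back up.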
Does anyone have any ideas as to what is going on? It seems like this issue may
be related to COUCHDB-2329.