Re: FreeBSD Building and Testing
On 6-1-2016 08:51, Mykola Golub wrote:
> On Mon, Dec 28, 2015 at 05:53:04PM +0100, Willem Jan Withagen wrote:
>> Can somebody try to help me and explain why the test
>> test/mon/osd-crush.sh, Func: TEST_crush_reject_empty, fails with a
>> python error which sort of startles me:
>> [...]
>> RuntimeError: "{'prefix': u'osd setcrushmap'}": exception "['{"prefix":
>> "osd setcrushmap"}']": exception 'utf8' codec can't decode byte 0x86 in
>> position 56: invalid start byte
>>
>> Which is certainly not the type of error expected.
>> But it is hard to detect any 0x86 in the arguments.
>
> Are you able to reproduce this problem manually? I.e. in src dir, start
> the cluster using vstart.sh:
> [...]
> Expected output:
>
> "Error EINVAL: Failed crushmap test: ./crushtool: exit status: 1"

Hi all,

I've spent the Xmas days trying to learn more about Python. (And
catching up with old friends :) ) My heritage is the days of assembler,
shell scripts, C, Perl and the like, so the pony had to learn a few new
tricks (a.k.a. a new language). I'm now trying to get the python
nosetests to actually work. In the meantime I also found that FreeBSD
has patches for Googletest that make most of the DEATH tests actually
work.

I think this python stream parse error got resolved by upgrading
everything built, including the complete package environment, and
upgrading the kernel and tools... :) Which I think cleaned out the
python environment, which was a bit mixed up with different versions.
Now test/mon/osd-crush.sh returns OK, so I guess the setup of the
environment is relatively critical.
I also noted that some of the tests get more tests done IF I run them
under root privileges. The last test run resulted in:

= ceph 10.0.1: src/test-suite.log =
# TOTAL: 120
# PASS:  110
# SKIP:  0
# XFAIL: 0
# FAIL:  10
# XPASS: 0
# ERROR: 0
FAIL ceph-detect-init/run-tox.sh (exit status: 1)
FAIL test/run-rbd-unit-tests.sh (exit status: 138)
FAIL test/ceph_objectstore_tool.py (exit status: 1)
FAIL test/cephtool-test-mon.sh (exit status: 1)
FAIL test/cephtool-test-rados.sh (exit status: 1)
FAIL test/libradosstriper/rados-striper.sh (exit status: 1)
FAIL test/test_objectstore_memstore.sh (exit status: 127)
FAIL test/ceph-disk.sh (exit status: 1)
FAIL test/pybind/test_ceph_argparse.py (exit status: 127)
FAIL test/pybind/test_ceph_daemon.py (exit status: 127)

The first and the last two actually don't work because of python things
that are not working on FreeBSD, which I still have to sort out:

ceph_detect_init.exc.UnsupportedPlatform: Platform is not supported.
../test-driver: ./test/pybind/test_ceph_argparse.py: not found
FAIL test/pybind/test_ceph_argparse.py (exit status: 127)

I also have:

./test/test_objectstore_memstore.sh: ./ceph_test_objectstore: not found
FAIL test/test_objectstore_memstore.sh (exit status: 127)

which is a weird one that needs some TLC.

So I'm slowly getting there...

--WjW
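For reference, ceph-detect-init gives up when it does not recognize the
host OS, which is where the UnsupportedPlatform exception comes from. A
minimal python sketch of that kind of platform gate, with hypothetical
return values (the real tool inspects the distro and is considerably
more involved):

    import platform

    class UnsupportedPlatform(Exception):
        pass

    def detect_init_system():
        # platform.system() returns e.g. 'Linux' or 'FreeBSD'
        system = platform.system()
        if system == 'Linux':
            return 'sysvinit'  # simplified; the real tool checks the distro
        if system == 'FreeBSD':
            return 'bsdrc'     # hypothetical name for FreeBSD's rc.d
        raise UnsupportedPlatform('Platform is not supported.: %s' % system)

Teaching a gate like this about FreeBSD is presumably the kind of change
these failures need.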
Re: FreeBSD Building and Testing
On 5-1-2016 19:23, Gregory Farnum wrote:
> On Mon, Dec 28, 2015 at 8:53 AM, Willem Jan Withagen wrote:
>> Can somebody try to help me and explain why the test
>> test/mon/osd-crush.sh, Func: TEST_crush_reject_empty, fails with a
>> python error which sort of startles me:
>> [...]
>> RuntimeError: "{'prefix': u'osd setcrushmap'}": exception "['{"prefix":
>> "osd setcrushmap"}']": exception 'utf8' codec can't decode byte 0x86 in
>> position 56: invalid start byte
>>
>> Which is certainly not the type of error expected.
>> But it is hard to detect any 0x86 in the arguments.
>>
>> And yes, python is right: there are no UTF8 sequences that start with
>> 0x86. Question is:
>> Why does it want to parse with UTF8?
>> And how do I switch it off?
>> Or how do I fix this error?
>
> I've not handled this myself but we've seen this a few times. The
> latest example in a quick email search was
> http://tracker.ceph.com/issues/9405, and it was apparently caused by a
> string which wasn't null-terminated.

Looks like in my case it was due to too large a mess in the python
environment. But I'll keep this in mind, IFF it comes back to haunt me
more.

Thanx,
--WjW
Re: FreeBSD Building and Testing
On 6-1-2016 08:51, Mykola Golub wrote:
> Are you able to reproduce this problem manually? I.e. in src dir, start
> the cluster using vstart.sh:
>
> ./vstart.sh -n
>
> Check it is running:
>
> ./ceph -s
>
> Repeat the test:
>
> truncate -s 0 empty_map.txt
> ./crushtool -c empty_map.txt -o empty_map.map
> ./ceph osd setcrushmap -i empty_map.map
>
> Expected output:
>
> "Error EINVAL: Failed crushmap test: ./crushtool: exit status: 1"

OK, thanx. Nice to have some of these examples...

--WjW
Re: FreeBSD Building and Testing
On Mon, Dec 28, 2015 at 8:53 AM, Willem Jan Withagen wrote:
> Hi,
>
> Can somebody try to help me and explain why the test
> test/mon/osd-crush.sh, Func: TEST_crush_reject_empty, fails with a
> python error which sort of startles me:
>
> [...]
> *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
> Traceback (most recent call last):
>   File "./ceph", line 936, in <module>
>     retval = main()
>   File "./ceph", line 874, in main
>     sigdict, inbuf, verbose)
>   File "./ceph", line 457, in new_style_command
>     inbuf=inbuf)
>   File "/usr/srcs/Ceph/wip-freebsd-wjw/ceph/src/pybind/ceph_argparse.py",
> line 1208, in json_command
>     raise RuntimeError('"{0}": exception {1}'.format(argdict, e))
> RuntimeError: "{'prefix': u'osd setcrushmap'}": exception "['{"prefix":
> "osd setcrushmap"}']": exception 'utf8' codec can't decode byte 0x86 in
> position 56: invalid start byte
>
> Which is certainly not the type of error expected.
> But it is hard to detect any 0x86 in the arguments.
>
> And yes, python is right: there are no UTF8 sequences that start with
> 0x86. Question is:
> Why does it want to parse with UTF8?
> And how do I switch it off?
> Or how do I fix this error?

I've not handled this myself but we've seen this a few times. The
latest example in a quick email search was
http://tracker.ceph.com/issues/9405, and it was apparently caused by a
string which wasn't null-terminated.
-Greg
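To make that concrete: if the C++ side hands the bindings a buffer that
runs past the intended end of the string, the trailing garbage makes an
otherwise valid message undecodable. A sketch with made-up buffer
contents (not the actual Ceph data path):

    payload = b'{"prefix": "osd setcrushmap"}'
    # A copy that ignores the string length drags stray bytes along,
    # e.g. 0x86, which can never start a UTF-8 sequence.
    buf = payload + b'\x00\x86\x13'

    try:
        buf.decode('utf-8')
    except UnicodeDecodeError as e:
        print(e)  # ... can't decode byte 0x86 in position 30: invalid start byte

    # Honouring the terminating NUL recovers the intended string:
    print(buf.split(b'\x00', 1)[0].decode('utf-8'))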
Re: FreeBSD Building and Testing
On Mon, Dec 28, 2015 at 05:53:04PM +0100, Willem Jan Withagen wrote:
> Hi,
>
> Can somebody try to help me and explain why the test
> test/mon/osd-crush.sh, Func: TEST_crush_reject_empty, fails with a
> python error which sort of startles me:
>
> [...]
> RuntimeError: "{'prefix': u'osd setcrushmap'}": exception "['{"prefix":
> "osd setcrushmap"}']": exception 'utf8' codec can't decode byte 0x86 in
> position 56: invalid start byte
>
> Which is certainly not the type of error expected.
> But it is hard to detect any 0x86 in the arguments.

Are you able to reproduce this problem manually? I.e. in src dir, start
the cluster using vstart.sh:

./vstart.sh -n

Check it is running:

./ceph -s

Repeat the test:

truncate -s 0 empty_map.txt
./crushtool -c empty_map.txt -o empty_map.map
./ceph osd setcrushmap -i empty_map.map

Expected output:

"Error EINVAL: Failed crushmap test: ./crushtool: exit status: 1"

-- 
Mykola Golub
Re: FreeBSD Building and Testing
Hi,

Can somebody try to help me and explain why, in test:

Func: test/mon/osd-crush
Func: TEST_crush_reject_empty started

it fails with a python error which sort of startles me:

test/mon/osd-crush.sh:227: TEST_crush_reject_empty: local empty_map=testdir/osd-crush/empty_map
test/mon/osd-crush.sh:228: TEST_crush_reject_empty: :
test/mon/osd-crush.sh:229: TEST_crush_reject_empty: ./crushtool -c testdir/osd-crush/empty_map.txt -o testdir/osd-crush/empty_map.map
test/mon/osd-crush.sh:230: TEST_crush_reject_empty: expect_failure testdir/osd-crush 'Error EINVAL' ./ceph osd setcrushmap -i testdir/osd-crush/empty_map.map
../qa/workunits/ceph-helpers.sh:1171: expect_failure: local dir=testdir/osd-crush
../qa/workunits/ceph-helpers.sh:1172: expect_failure: shift
../qa/workunits/ceph-helpers.sh:1173: expect_failure: local 'expected=Error EINVAL'
../qa/workunits/ceph-helpers.sh:1174: expect_failure: shift
../qa/workunits/ceph-helpers.sh:1175: expect_failure: local success
../qa/workunits/ceph-helpers.sh:1176: expect_failure: pwd
../qa/workunits/ceph-helpers.sh:1177: expect_failure: printenv
../qa/workunits/ceph-helpers.sh:1178: expect_failure: echo ./ceph osd setcrushmap -i testdir/osd-crush/empty_map.map
../qa/workunits/ceph-helpers.sh:1180: expect_failure: ./ceph osd setcrushmap -i testdir/osd-crush/empty_map.map
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
Traceback (most recent call last):
  File "./ceph", line 936, in <module>
    retval = main()
  File "./ceph", line 874, in main
    sigdict, inbuf, verbose)
  File "./ceph", line 457, in new_style_command
    inbuf=inbuf)
  File "/usr/srcs/Ceph/wip-freebsd-wjw/ceph/src/pybind/ceph_argparse.py", line 1208, in json_command
    raise RuntimeError('"{0}": exception {1}'.format(argdict, e))
RuntimeError: "{'prefix': u'osd setcrushmap'}": exception "['{"prefix": "osd setcrushmap"}']": exception 'utf8' codec can't decode byte 0x86 in position 56: invalid start byte

Which is certainly not the type of error expected.
But it is hard to detect any 0x86 in the arguments.

And yes, python is right: there are no UTF8 sequences that start with 0x86.
Question is:
Why does it want to parse with UTF8?
And how do I switch it off?
Or how do I fix this error?

Thanx,
--WjW
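For the record: 0x86 is 0b10000110, and in UTF-8 any byte of the form
10xxxxxx (0x80-0xBF) is a continuation byte, so it can never start a
sequence; python is indeed right. A quick sketch of the failure, plus a
lossy decode that at least shows where the junk sits:

    # 0x86 is a continuation byte, invalid as the start of a sequence.
    try:
        b'\x86'.decode('utf-8')
    except UnicodeDecodeError as e:
        print(e)  # 'utf-8' codec can't decode byte 0x86 in position 0: ...

    # The decode cannot really be "switched off" -- JSON is defined over
    # Unicode text -- but a lossy decode localizes the garbage:
    print(b'prefix \x86 tail'.decode('utf-8', errors='replace'))

The real question is where the stray byte enters the buffer; the replies
above point at a non-null-terminated string as one known cause.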
Re: FreeBSD Building and Testing
On 21-12-2015 01:45, Xinze Chi (信泽) wrote:
> sorry for delay reply. Please have a try
> https://github.com/ceph/ceph/commit/ae4a8162eacb606a7f65259c6ac236e144bfef0a.

Tried this one first:

Testsuite summary for ceph 10.0.1
# TOTAL: 120
# PASS:  100
# SKIP:  0
# XFAIL: 0
# FAIL:  20
# XPASS: 0
# ERROR: 0

So that certainly helps. I have not yet analyzed the log files, but it
seems we are getting somewhere.

I needed to manually kill a rados access in:

| | | \-+- 09792 wjw /bin/sh ../test-driver ./test/ceph_objectstore_tool.py
| | |   \-+- 09807 wjw python ./test/ceph_objectstore_tool.py (python2.7)
| | |     \--- 11406 wjw /usr/srcs/Ceph/wip-freebsd-wjw/ceph/src/.libs/rados -p rep_pool -N put REPobject1 /tmp/data.9807/-REPobject1__head

But also 2 mon-osd's were running, and perhaps one did not belong to
that test, so they could have been in each other's way.

Found some fails in the OSDs at:

./test-suite.log:osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())
./test-suite.log:osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())

struct OnRecoveryReadComplete :
  public GenContext<pair<RecoveryMessages *, ECBackend::read_result_t &> &> {
  ECBackend *pg;
  hobject_t hoid;
  set<int> want;
  OnRecoveryReadComplete(ECBackend *pg, const hobject_t &hoid)
    : pg(pg), hoid(hoid) {}
  void finish(pair<RecoveryMessages *, ECBackend::read_result_t &> &in) {
    ECBackend::read_result_t &res = in.second;
    // FIXME???
    assert(res.r == 0);
    assert(res.errors.empty());   // <- line 201, the failing assert
    assert(res.returned.size() == 1);
    pg->handle_recovery_read_complete(
      hoid,
      res.returned.back(),
      res.attrs,
      in.first);
  }
};

Given the FIXME??? the code here could be fishy?

I would say that just this patch would be sufficient. The second patch
also looks like it could be useful, since it lowers the bar on being
tested. And when aligning is only required because of (a)iovec
processing, 4096 will likely suffice.

Thank you very much for the help.

--WjW

2015-12-21 0:10 GMT+08:00 Willem Jan Withagen:
> Hi,
>
> Most of the Ceph is getting there in the most crude and rough state.
> So beneath is a status update on what is not working for me yet.
>
> Especially help with the alignment problem in os/FileJournal.cc would
> be appreciated... It would allow me to run ceph-osd and run more tests
> to completion.
>
> What would happen if I comment out this test, and ignore the fact that
> things might be unaligned?
> Is it a performance/paging issue?
> Or is data going to be corrupted?
> --WjW
>
> [...]
>
> Most of the fails are because ceph-osd crashed consistently on:
>
> -1 journal bl.is_aligned(block_size) 0 bl.is_n_align_sized(CEPH_MINIMUM_BLOCK_SIZE) 1
> -1 journal block_size 131072 CEPH_MINIMUM_BLOCK_SIZE 4096 CEPH_PAGE_SIZE 4096 header.alignment 131072
> bl buffer::list(len=131072, buffer::ptr(0~131072 0x805319000 in raw 0x805319000 len 131072 nref 1))
> os/FileJournal.cc: In function 'void FileJournal::align_bl(off64_t, bufferlist &)' thread 805217400 time 2015-12-19 13:43:06.706797
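For what it's worth: the check that fails is about the memory address of
the journal buffer, not its length. The buffer is a whole number of
4096-byte blocks (is_n_align_sized is 1), but its start address is not a
multiple of header.alignment = 131072 (is_aligned is 0). A small python
sketch of just that arithmetic (illustration only, nothing to do with
the bufferlist code itself):

    import ctypes
    import mmap

    block_size = 131072              # header.alignment from the log
    page_size = mmap.PAGESIZE        # typically 4096

    # Anonymous mappings come back page-aligned, nothing more.
    buf = mmap.mmap(-1, block_size)
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))

    print(addr % page_size == 0)     # True: page-aligned
    print(addr % block_size == 0)    # often False: page-aligned != 128 KiB-aligned

So any allocator that guarantees only page alignment will satisfy the
4096 check and still trip the 131072 one.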
Re: FreeBSD Building and Testing
On 20-12-2015 17:10, Willem Jan Withagen wrote:
> Hi,
>
> Most of the Ceph is getting there in the most crude and rough state.
> So beneath is a status update on what is not working for me yet.

Further:

A) unittest_erasure_code_plugin fails on the fact that a different
error code is returned when dlopen-ing a non-existent library:

load dlopen(.libs/libec_invalid.so): Cannot open ".libs/libec_invalid.so"
load dlsym(.libs/libec_missing_version.so, __erasure_code_init): Undefined symbol "__erasure_code_init"
test/erasure-code/TestErasureCodePlugin.cc:88: Failure
Value of: instance.factory("missing_version", g_conf->erasure_code_dir, profile, _code, )
  Actual: -2
Expected: -18

EXDEV is actually 18, so that part is correct. But EXDEV is the
cross-device link error, whereas the actual answer -2 is factually
correct:

#define ENOENT 2 /* No such file or directory */

So why does the test expect EXDEV instead of ENOENT? Could be a typical
Linux <> FreeBSD thingy.

--WjW
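Note that dlopen() itself reports failure only through a dlerror()
message; the numeric code the caller returns (ENOENT, EXDEV, ...) is a
mapping chosen by the loading code, not by libdl. A quick ctypes sketch
to poke at the raw behaviour (illustration only):

    import ctypes
    import errno

    try:
        ctypes.CDLL('.libs/libec_invalid.so')  # wraps dlopen() on Unix
    except OSError as e:
        # dlopen() returned NULL; all we get is the dlerror() text.
        print(e)

    print(errno.ENOENT)  # 2  -- "No such file or directory"
    print(errno.EXDEV)   # 18 -- "Cross-device link"

So which errno a failed load maps to is a convention in the loading
code, which is presumably where the Linux <> FreeBSD difference creeps
in.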
Re: FreeBSD Building and Testing
sorry for delay reply. Please have a try
https://github.com/ceph/ceph/commit/ae4a8162eacb606a7f65259c6ac236e144bfef0a.

2015-12-21 0:10 GMT+08:00 Willem Jan Withagen:
> Hi,
>
> Most of the Ceph is getting there in the most crude and rough state.
> So beneath is a status update on what is not working for me yet.
>
> Especially help with the alignment problem in os/FileJournal.cc would
> be appreciated... It would allow me to run ceph-osd and run more tests
> to completion.
>
> What would happen if I comment out this test, and ignore the fact that
> things might be unaligned?
> Is it a performance/paging issue?
> Or is data going to be corrupted?
>
> [...]
> Most of the fails are because ceph-osd crashed consistently on:
> [...]
> os/FileJournal.cc: 1045: FAILED assert(0 == "bl should be align")
> [...]
> D) Fix some of the other python code to work as expected.

-- 
Regards,
Xinze Chi
Fwd: FreeBSD Building and Testing
-- Forwarded message --
From: Xinze Chi (信泽) <xmdx...@gmail.com>
Date: 2015-12-21 8:59 GMT+08:00
Subject: Re: FreeBSD Building and Testing
To: Willem Jan Withagen <w...@digiware.nl>

Please try this patch
https://github.com/XinzeChi/ceph/commit/f4d5bd01a2e498e850e3a43fb1233ba40d8d1781
again, and tell me which patch could fix the bug. Thanks.

2015-12-21 8:45 GMT+08:00 Xinze Chi (信泽) <xmdx...@gmail.com>:
> sorry for delay reply. Please have a try
> https://github.com/ceph/ceph/commit/ae4a8162eacb606a7f65259c6ac236e144bfef0a.
>
> 2015-12-21 0:10 GMT+08:00 Willem Jan Withagen <w...@digiware.nl>:
>> Hi,
>>
>> Most of the Ceph is getting there in the most crude and rough state.
>> So beneath is a status update on what is not working for me yet.
>> [...]
FreeBSD Building and Testing
Hi,

Most of the Ceph is getting there in the most crude and rough state.
So beneath is a status update on what is not working for me yet.

Especially help with the alignment problem in os/FileJournal.cc would be
appreciated... It would allow me to run ceph-osd and run more tests to
completion.

What would happen if I comment out this test, and ignore the fact that
things might be unaligned?
Is it a performance/paging issue?
Or is data going to be corrupted?

--WjW

PASS: src/test/run-cli-tests

Testsuite summary for ceph 10.0.0
# TOTAL: 1
# PASS:  1
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

gmake test:

Testsuite summary for ceph 10.0.0
# TOTAL: 119
# PASS:  95
# SKIP:  0
# XFAIL: 0
# FAIL:  24
# XPASS: 0
# ERROR: 0

The following notes can be made with this:
1) the run-cli-tests run to completion because I excluded the RBD tests
2) gmake test has the following tests FAIL:

FAIL: unittest_erasure_code_plugin
FAIL: ceph-detect-init/run-tox.sh
FAIL: test/erasure-code/test-erasure-code.sh
FAIL: test/erasure-code/test-erasure-eio.sh
FAIL: test/run-rbd-unit-tests.sh
FAIL: test/ceph_objectstore_tool.py
FAIL: test/test-ceph-helpers.sh
FAIL: test/cephtool-test-osd.sh
FAIL: test/cephtool-test-mon.sh
FAIL: test/cephtool-test-mds.sh
FAIL: test/cephtool-test-rados.sh
FAIL: test/mon/osd-crush.sh
FAIL: test/osd/osd-scrub-repair.sh
FAIL: test/osd/osd-scrub-snaps.sh
FAIL: test/osd/osd-config.sh
FAIL: test/osd/osd-bench.sh
FAIL: test/osd/osd-reactivate.sh
FAIL: test/osd/osd-copy-from.sh
FAIL: test/libradosstriper/rados-striper.sh
FAIL: test/test_objectstore_memstore.sh
FAIL: test/ceph-disk.sh
FAIL: test/pybind/test_ceph_argparse.py
FAIL: test/pybind/test_ceph_daemon.py
FAIL: ../qa/workunits/erasure-code/encode-decode-non-regression.sh

Most of the fails are because ceph-osd crashed consistently on:

-1 journal bl.is_aligned(block_size) 0 bl.is_n_align_sized(CEPH_MINIMUM_BLOCK_SIZE) 1
-1 journal block_size 131072 CEPH_MINIMUM_BLOCK_SIZE 4096 CEPH_PAGE_SIZE 4096 header.alignment 131072
bl buffer::list(len=131072, buffer::ptr(0~131072 0x805319000 in raw 0x805319000 len 131072 nref 1))
os/FileJournal.cc: In function 'void FileJournal::align_bl(off64_t, bufferlist &)' thread 805217400 time 2015-12-19 13:43:06.706797
os/FileJournal.cc: 1045: FAILED assert(0 == "bl should be align")

This has been bugging me already for a few days, but I haven't found an
easy way to debug this, either running it in gdb while live, or
post-mortem.

Further:

A) unittest_erasure_code_plugin fails on the fact that a different
error code is returned when dlopen-ing a non-existent library:

load dlopen(.libs/libec_invalid.so): Cannot open ".libs/libec_invalid.so"
load dlsym(.libs/libec_missing_version.so, __erasure_code_init): Undefined symbol "__erasure_code_init"
test/erasure-code/TestErasureCodePlugin.cc:88: Failure
Value of: instance.factory("missing_version", g_conf->erasure_code_dir, profile, _code, )
  Actual: -2
Expected: -18
load dlsym(.libs/libec_missing_entry_point.so, __erasure_code_init): Undefined symbol "__erasure_code_init"
erasure_code_init(fail_to_initialize,.libs): (3) No such process
load __erasure_code_init() did not register fail_to_register
load: example
erasure_code_init(example,.libs): (17) File exists
load: example
[  FAILED  ] ErasureCodePluginRegistryTest.all (330 ms)

B) ceph-detect-init/run-tox.sh fails on the fact that I still need to
put in the FreeBSD work in the tests.

C) ./gtest/include/gtest/internal/gtest-port.h:1358:: Condition
has_owner_ && pthread_equal(owner_, pthread_self()) failed. The current
thread is not holding the mutex @0x161ef20
./test/run-rbd-unit-tests.sh: line 9: 78053 Abort trap (core dumped) unittest_librbd

Which I think I found some commit comments about, in either trac or git,
about FreeBSD not being able to do things to its own thread. Got to look
into this.

D) Fix some of the other python code to work as expected.
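Coming back to C): the gtest condition is an ownership assertion. The
mutex records which pthread locked it, and the assertion compares that
owner against pthread_self(). A rough python equivalent of the idea
(hypothetical class, not gtest's actual code):

    import threading

    class OwnershipCheckedMutex:
        """Records the locking thread, like the Mutex in gtest-port.h."""

        def __init__(self):
            self._lock = threading.Lock()
            self._owner = None  # ident of the thread holding the lock

        def lock(self):
            self._lock.acquire()
            self._owner = threading.get_ident()

        def unlock(self):
            self.assert_held()
            self._owner = None
            self._lock.release()

        def assert_held(self):
            # Mirrors: has_owner_ && pthread_equal(owner_, pthread_self())
            assert self._owner == threading.get_ident(), \
                'The current thread is not holding the mutex'

If pthread_self()/pthread_equal() gives a different answer in the thread
that took the lock, or the lock is released from another thread, exactly
this assertion fires.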