On Fri, 2008-06-27 at 17:14 +0200, Manuel Teira wrote:
> Alan Conway wrote:
> > On Thu, 2008-06-26 at 12:12 +0200, Manuel Teira wrote:
> >
> >> Hello.
> >> After further investigation and tests, related with the change in
> >> r671604 to drop the file locking strategy in favour of a flock on the
> >> data dir.
> >>
> >> Trying to write a similar code, but using lockf, I hit the issue that
> >> the file must be opened using O_RDWR or O_WRONLY, and that's not allowed
> >> for a directory.
> >> The same happens trying to use a fcntl call.
> >> And unexpectedly, the same for flock. In the solaris manual page:
> >>
> >> <snip>
> >>      Read permission is required on a file to obtain a shared
> >>      lock, and write permission is required to obtain an
> >>      exclusive lock.
> >> </snip>
> >>
> >> But the linux man page claims:
> >>
> >> <snip>
> >> A shared or exclusive lock can be placed on a file regardless of the
> >> mode in which the file was opened.
> >> </snip>
> >>
> >> I've searched the web for some BSD system pages, but they don't say
> >> anything about the file mode.
> >>
> >>
> >> On the other hand, the POSIX fcntl specification says, apropos the failure
> >> causes:
> >>
> >> [EBADF]
> >>     The /fildes/ argument is not a valid open file descriptor, or the
> >>     argument /cmd/ is F_SETLK or F_SETLKW, the type of lock, *l_type*,
> >>     is a shared lock (F_RDLCK), and /fildes/ is not a valid file
> >>     descriptor open for reading, or the type of lock *l_type*, is an
> >>     exclusive lock (F_WRLCK), and /fildes/ is not a valid file
> >>     descriptor open for writing.
> >>
> >> Posix specs also forces write permissions for lockf:
> >> http://www.opengroup.org/onlinepubs/007908799/xsh/lockf.html
> >>
> >>
> >>
> >> This leads to solaris not being able to lock directly on a directory,
> >> I'm afraid. Any idea?
> >>
> >
> >
> > Yes, we can create (if it doesn't already exist) a lock file in the
> > directory and then use lockf to lock it. There's already code in
> > Daemon.cpp that does exactly this for the PID file. The reason I
> > switched to flock was because crashing or killed brokers were sometimes
> > leaving the lock file behind them, whereas a flock (or lockf) lock is
> > automatically released when the process exits.
> >
> > We need to
> > - create a qpid::sys::LockFile class that can be re-implemented on
> > different platforms.
> > - use the Daemon.cpp code as the posix implementation.
> > - Replace the locking code in Daemon.cpp and DataDir.cpp with the
> > common sys::LockFile.
> >
> > It's JIRA https://issues.apache.org/jira/browse/QPID-1158
> > Could you take this on Manuel? I can do it but it may take a couple
> > days to get to it.
> >
> Of course, I will try (will try to start on monday). For the moment I've
> reverted changes to keep using the old DataDir.cpp code. I was able to
> pass most of the tests on solaris (more changes about bashisms needed,
> though), I will have to take a look about some random message, but this
> is a dump of a 'make check' session now:
>
> -bash-3.00$ make check
> make  libshlibtest.la libdlclose_noop.la unit_test perftest txtest
> latencytest client_test topic_listener topic_publisher publish consume
> `libshlibtest.la' is up to date.
> `libdlclose_noop.la' is up to date.
> `unit_test' is up to date.
> `perftest' is up to date.
> `txtest' is up to date.
> `latencytest' is up to date.
> `client_test' is up to date.
> `topic_listener' is up to date.
> `topic_publisher' is up to date.
> `publish' is up to date.
> `consume' is up to date.
> make  check-TESTS
> Running 154 test cases...
> 2008-jun-27 17:09:18 error Exception in client dispatch thread:
> Connection closed by broker
>
> *** No errors detected
> PASS: unit_test
> PASS: start_broker
> PASS: client_test
> SubscribeThread exception: Sequence error: expected n==1 but got 0
> (perftest.cpp:524)
> FAIL: quick_perftest
> PASS: quick_topictest
> sh: objdump: not found
> test_example (tests_0-10.example.ExampleTest) ... ok
> test_auto_rollback (tests_0-10.tx.TxTests) ... ok
> test_commit (tests_0-10.tx.TxTests) ... ok
> test_rollback (tests_0-10.tx.TxTests) ... ok
> test_broker_connectivity (tests_0-10.management.ManagementTest) ... ok
> test_self_session_id (tests_0-10.management.ManagementTest) ... ok
> test_standard_exchanges (tests_0-10.management.ManagementTest) ... ok
> test_system_object (tests_0-10.management.ManagementTest) ... ok
> test_bad_resume (tests_0-10.dtx.DtxTests) ... ok
> test_commit_unknown (tests_0-10.dtx.DtxTests) ... ok
> test_end (tests_0-10.dtx.DtxTests) ... ok
> test_end_suspend_and_fail (tests_0-10.dtx.DtxTests) ... ok
> test_end_unknown_xid (tests_0-10.dtx.DtxTests) ... ok
> test_forget_xid_on_completion (tests_0-10.dtx.DtxTests) ... ok
> test_get_timeout (tests_0-10.dtx.DtxTests) ... ok
> test_get_timeout_unknown (tests_0-10.dtx.DtxTests) ... ok
> test_implicit_end (tests_0-10.dtx.DtxTests) ... ok
> test_invalid_commit_not_ended (tests_0-10.dtx.DtxTests) ... ok
> test_invalid_commit_one_phase_false (tests_0-10.dtx.DtxTests) ... ok
> test_invalid_commit_one_phase_true (tests_0-10.dtx.DtxTests) ... ok
> test_invalid_prepare_not_ended (tests_0-10.dtx.DtxTests) ... ok
> test_invalid_rollback_not_ended (tests_0-10.dtx.DtxTests) ... ok
> test_prepare_unknown (tests_0-10.dtx.DtxTests) ... ok
> test_recover (tests_0-10.dtx.DtxTests) ... ok
> test_rollback_unknown (tests_0-10.dtx.DtxTests) ... ok
> test_select_required (tests_0-10.dtx.DtxTests) ... ok
> test_set_timeout (tests_0-10.dtx.DtxTests) ... ok
> test_simple_commit (tests_0-10.dtx.DtxTests) ... ok
> test_simple_prepare_commit (tests_0-10.dtx.DtxTests) ... ok
> test_simple_prepare_rollback (tests_0-10.dtx.DtxTests) ... ok
> test_simple_rollback (tests_0-10.dtx.DtxTests) ... ok
> test_start_already_known (tests_0-10.dtx.DtxTests) ... ok
> test_start_join (tests_0-10.dtx.DtxTests) ... ok
> test_start_join_and_resume (tests_0-10.dtx.DtxTests) ... ok
> test_suspend_resume (tests_0-10.dtx.DtxTests) ... ok
> test_suspend_start_end_resume (tests_0-10.dtx.DtxTests) ... ok
> test_delete_while_used_by_exchange
> (tests_0-10.alternate_exchange.AlternateExchangeTests) ... ok
> test_delete_while_used_by_queue
> (tests_0-10.alternate_exchange.AlternateExchangeTests) ... ok
> test_queue_delete (tests_0-10.alternate_exchange.AlternateExchangeTests)
> ... ok
> test_unroutable (tests_0-10.alternate_exchange.AlternateExchangeTests)
> ... ok
> test (tests_0-10.exchange.DeclareMethodPassiveFieldNotFoundRuleTests) ... ok
> testDefaultExchange (tests_0-10.exchange.DefaultExchangeRuleTests) ... ok
> testHeadersBindNoMatchArg (tests_0-10.exchange.ExchangeTests) ... ok
> testMatchAll (tests_0-10.exchange.HeadersExchangeTests) ... ok
> testMatchAny (tests_0-10.exchange.HeadersExchangeTests) ... ok
> testDifferentDeclaredType (tests_0-10.exchange.MiscellaneousErrorsTests)
> ... ok
> testTypeNotKnown (tests_0-10.exchange.MiscellaneousErrorsTests) ... ok
> testDirect (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
> testFanout (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
> testHeaders (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
> testTopic (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
> testAmqDirect (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
> testAmqFanOut (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
> testAmqMatch (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
> testAmqTopic (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
> test_ack_and_no_ack (tests_0-10.broker.BrokerTests) ... ok
> test_simple_delivery_immediate (tests_0-10.broker.BrokerTests) ... ok
> test_simple_delivery_queued (tests_0-10.broker.BrokerTests) ... ok
> test_ack (tests_0-10.message.MessageTests) ... ok
> test_acquire (tests_0-10.message.MessageTests) ... ok
> test_acquire_with_no_accept_and_credit_flow
> (tests_0-10.message.MessageTests) ... ok
> test_cancel (tests_0-10.message.MessageTests) ... ok
> test_consume_exclusive (tests_0-10.message.MessageTests) ... ok
> test_consume_exclusive2 (tests_0-10.message.MessageTests) ... ok
> test_consume_queue_not_found (tests_0-10.message.MessageTests) ... ok
> test_consume_queue_not_specified (tests_0-10.message.MessageTests) ... ok
> test_consume_unique_consumers (tests_0-10.message.MessageTests) ... ok
> test_credit_flow_bytes (tests_0-10.message.MessageTests) ... ok
> test_credit_flow_messages (tests_0-10.message.MessageTests) ... ok
> test_empty_body (tests_0-10.message.MessageTests) ... ok
> test_incoming_start (tests_0-10.message.MessageTests) ... ok
> test_no_local (tests_0-10.message.MessageTests) ... ok
> test_no_local_awkward (tests_0-10.message.MessageTests) ... ok
> test_no_local_exclusive_subscribe (tests_0-10.message.MessageTests) ... ok
> test_ranged_ack (tests_0-10.message.MessageTests) ... ok
> test_reject (tests_0-10.message.MessageTests) ... ok
> test_release (tests_0-10.message.MessageTests) ... ok
> test_release_ordering (tests_0-10.message.MessageTests) ... ok
> test_release_unacquired (tests_0-10.message.MessageTests) ... ok
> test_subscribe_not_acquired (tests_0-10.message.MessageTests) ... ok
> test_subscribe_not_acquired_2 (tests_0-10.message.MessageTests) ... ok
> test_subscribe_not_acquired_3 (tests_0-10.message.MessageTests) ... ok
> test_window_flow_bytes (tests_0-10.message.MessageTests) ... ok
> test_window_flow_messages (tests_0-10.message.MessageTests) ... ok
> test_ack_message_from_deleted_queue
> (tests_0-10.persistence.PersistenceTests) ... ok
> test_delete_queue_after_publish
> (tests_0-10.persistence.PersistenceTests) ... ok
> test_queue_deletion (tests_0-10.persistence.PersistenceTests) ... ok
> test_autodelete_shared (tests_0-10.queue.QueueTests) ... ok
> test_bind (tests_0-10.queue.QueueTests) ... ok
> test_bind_queue_existence (tests_0-10.queue.QueueTests) ... ok
> test_declare_exclusive (tests_0-10.queue.QueueTests) ... ok
> test_declare_passive (tests_0-10.queue.QueueTests) ... ok
> test_delete_ifempty (tests_0-10.queue.QueueTests) ... ok
> test_delete_ifunused (tests_0-10.queue.QueueTests) ... ok
> test_delete_queue_exists (tests_0-10.queue.QueueTests) ... ok
> test_delete_simple (tests_0-10.queue.QueueTests) ... ok
> test_purge (tests_0-10.queue.QueueTests) ... ok
> test_purge_empty_name (tests_0-10.queue.QueueTests) ... ok
> test_purge_queue_exists (tests_0-10.queue.QueueTests) ... ok
> test_unbind_direct (tests_0-10.queue.QueueTests) ... ok
> test_unbind_fanout (tests_0-10.queue.QueueTests) ... ok
> test_unbind_headers (tests_0-10.queue.QueueTests) ... ok
> test_unbind_topic (tests_0-10.queue.QueueTests) ... ok
> test_exchange_bound_direct (tests_0-10.query.QueryTests) ... ok
> test_exchange_bound_fanout (tests_0-10.query.QueryTests) ... ok
> test_exchange_bound_header (tests_0-10.query.QueryTests) ... ok
> test_exchange_bound_topic (tests_0-10.query.QueryTests) ... ok
> test_exchange_query (tests_0-10.query.QueryTests) ... ok
> test_queue_query (tests_0-10.query.QueryTests) ... ok
> test_queue_query_unknown (tests_0-10.query.QueryTests) ... ok
>
> ----------------------------------------------------------------------
> Ran 110 tests in 88.510s
>
> OK
> PASS: python_tests
> PASS: stop_broker
> Running federation tests using brokers on ports 45428 45429
> sh: objdump: not found
> test_bridge_create_and_close (federation.FederationTests) ... ok
> test_pull_from_exchange (federation.FederationTests) ... ok
> test_pull_from_queue (federation.FederationTests) ... ok
> test_tracing (federation.FederationTests) ... ok
>
> ----------------------------------------------------------------------
> Ran 4 tests in 48.880s
>
> OK
> PASS: run_federation_tests
> ==============================================
> 1 of 8 tests failed
> Please report to [email protected]
> ==============================================
>
>
>
>
> Only a test is failing. There's also a weird message during unit_test
> (Exception in client dispatch thread: Connection closed by broker), and

That is not an error; it's coming from a test that deliberately provokes
various error conditions. It's being printed because the broker logs
errors on stderr by default. I can fix the tests to hide this message,
thanks for reminding me.

> also those "sh: objdump not found" messages I'm still not sure where
> they're coming from, since at a first look I was not able to find any
> objdump invocation. Other than that, it gives me hope about having a
> solaris working version soon.

It looks fantastic, definitely ready for a test drive on Linux. Will try
to do this next week.
