[jira] [Commented] (TS-3315) Assert after try lock

2015-01-23 Thread taorui (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289020#comment-14289020
 ] 

taorui commented on TS-3315:


It's great if you document it:
if the caller is interested in the result of remove (cont != NULL), then the
current thread must already hold the lock on cont->mutex;
if not (cont == NULL), a try lock is sufficient.
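The locking rule described above can be sketched as follows. This is an illustrative toy, not ATS's ProxyMutex: the names `ThreadMutex` and `remove_with_precondition` are invented here. The point is that a try-lock taken by a thread that already owns the mutex always succeeds, so under the documented precondition the assert after the try-lock can never fire.

```cpp
#include <cassert>
#include <thread>

// Toy mutex that records its owning thread, the way ATS mutexes do.
struct ThreadMutex {
  std::thread::id owner{};                   // default id means "unlocked"
  bool try_lock(std::thread::id self) {
    if (owner == self) return true;          // re-entrant: already ours
    if (owner == std::thread::id{}) {        // free: take it
      owner = self;
      return true;
    }
    return false;                            // held by another thread
  }
};

// Models the Cache.cc pattern under the documented precondition:
// the caller already holds m whenever it cares about the result.
bool remove_with_precondition(ThreadMutex &m, std::thread::id self) {
  bool locked = m.try_lock(self);  // cf. CACHE_TRY_LOCK(lock, cont->mutex, ...)
  assert(locked);                  // cf. ink_assert(lock.is_locked())
  return locked;
}
```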





 Assert after try lock
 -

 Key: TS-3315
 URL: https://issues.apache.org/jira/browse/TS-3315
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Phil Sorber

 In iocore/cache/Cache.cc there is the following:
 {code}
   CACHE_TRY_LOCK(lock, cont->mutex, this_ethread());
   ink_assert(lock.is_locked());
 {code}
 Does it really make sense to try and assert when a try can fail?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-1601) HttpServerSession::release don't close ServerSession if ServerSessionPool locking contention

2013-05-16 Thread taorui (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659499#comment-13659499
 ] 

taorui commented on TS-1601:


We have almost no way to release the server session to the session pool unless 
the session bucket's mutex has been acquired.
If the session is rescheduled for retry because of lock contention on the 
session bucket, who will handle the netvc's events, especially the 
timeout events?



 HttpServerSession::release don't close ServerSession if ServerSessionPool 
 locking contention
 

 Key: TS-1601
 URL: https://issues.apache.org/jira/browse/TS-1601
 Project: Traffic Server
  Issue Type: Improvement
  Components: Network
Affects Versions: 3.2.0
Reporter: Bin Chen
Assignee: Bin Chen
Priority: Minor
 Fix For: 3.3.1

 Attachments: TS-1601.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (TS-1821) AIO tests don't build with native AIO

2013-04-17 Thread taorui (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13634635#comment-13634635
 ] 

taorui commented on TS-1821:


On 04/18/2013 01:41 AM, James Peach (JIRA) wrote:
It should have been done in ink_aio_init(), or in something like 
AIOProcessor::start if we had one. The problem is that ink_aio_init is called 
before the event threads are created (eventProcessor.start). I also don't know 
why we have the DiskHandler in EThread but never used it before.





 AIO tests don't build with native AIO
 -

 Key: TS-1821
 URL: https://issues.apache.org/jira/browse/TS-1821
 Project: Traffic Server
  Issue Type: Bug
  Components: Build
Reporter: James Peach
Assignee: weijin
 Fix For: 3.3.3

 Attachments: ts-1821.wj.diff


 /opt/src/trafficserver.git/configure --prefix=/opt/ats 
 --enable-linux-native-aio && make -j 4 && make check
 test_AIO-test_AIO.o: In function `main':
 /opt/src/trafficserver.git/iocore/aio/test_AIO.cc:498: undefined reference to 
 `cache_config_threads_per_disk'
 collect2: error: ld returned 1 exit status



[jira] [Commented] (TS-1405) apply time-wheel scheduler about event system

2013-04-08 Thread taorui (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625492#comment-13625492
 ] 

taorui commented on TS-1405:


On 04/08/2013 11:27 PM, John Plevyak (JIRA) wrote:
Yes, on an unloaded system the problem you mention exists. Should we add a 
trigger mechanism to wake the thread out of epoll_wait for disk I/O events? 
I chose this scheme because it is easy to implement.

 [ 
https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625460#comment-13625460
 ]

John Plevyak commented on TS-1405:
--

The patch includes:

+#if AIO_MODE == AIO_MODE_NATIVE
+#define AIO_PERIOD -HRTIME_MSECONDS(4)
+#else

Even if it was set to zero, on an unloaded system it would only get
polled every 10 msecs because that is the poll rate for epoll(), so
you could potentially delay a disk IO by that amount of time.








 apply time-wheel scheduler  about event system
 --

 Key: TS-1405
 URL: https://issues.apache.org/jira/browse/TS-1405
 Project: Traffic Server
  Issue Type: Improvement
  Components: Core
Affects Versions: 3.2.0
Reporter: Bin Chen
Assignee: Bin Chen
 Fix For: 3.3.2

 Attachments: linux_time_wheel.patch, linux_time_wheel_v10jp.patch, 
 linux_time_wheel_v11jp.patch, linux_time_wheel_v2.patch, 
 linux_time_wheel_v3.patch, linux_time_wheel_v4.patch, 
 linux_time_wheel_v5.patch, linux_time_wheel_v6.patch, 
 linux_time_wheel_v7.patch, linux_time_wheel_v8.patch, 
 linux_time_wheel_v9jp.patch


 When there are more and more events in the event system, the scheduler gets 
 worse. This is the reason we use InactivityCop to handle keepalive. The new 
 scheduler is a time wheel, which has better time complexity (O(1)).
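A timing wheel of the kind the description refers to can be sketched as follows. This is an illustrative toy, not the attached patch: events due at tick T hash into slot T % N, so insertion and per-tick expiry are O(1) in the slot count, unlike an O(log n) priority queue.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Minimal timing wheel: one vector of slots, a monotonically advancing tick.
class TimingWheel {
  struct Entry { long due; std::function<void()> cb; };
  std::vector<std::vector<Entry>> slots_;
  long now_ = 0;
public:
  explicit TimingWheel(std::size_t n) : slots_(n) {}
  void schedule(long delay, std::function<void()> cb) {
    long due = now_ + delay;
    slots_[due % slots_.size()].push_back({due, std::move(cb)});  // O(1) insert
  }
  void tick() {                       // advance one tick and fire due events
    ++now_;
    auto &slot = slots_[now_ % slots_.size()];
    std::vector<Entry> keep;
    for (auto &e : slot) {
      if (e.due <= now_) e.cb();      // due this revolution: fire it
      else keep.push_back(e);         // wrapped around: not yet due
    }
    slot.swap(keep);
  }
};
```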



[jira] [Commented] (TS-1528) ats_memalign: couldn't allocate -548249600 bytes in Vol::init()

2012-10-16 Thread taorui (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477521#comment-13477521
 ] 

taorui commented on TS-1528:


We had some discussion on IRC: he has a 3 TB disk, the average object
size is 8 KB, and multiple volumes are not configured, so TS needs about
(3*2^40 / 8000) * sizeof(Dir) bytes for the directory, which is larger than 2^31.
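The negative byte count in the error message is that oversized directory allocation truncated into a signed 32-bit value on the i686 build. The exact byte count below (3,746,717,696) is inferred from the reported -548249600, not stated in the ticket:

```cpp
#include <cassert>
#include <cstdint>

// Truncate a 64-bit byte count into a signed 32-bit value the way a 32-bit
// size parameter would, showing how a >2^31 allocation request goes negative.
int32_t as_int32(uint64_t bytes) {
  return static_cast<int32_t>(static_cast<uint32_t>(bytes));  // 32-bit wrap
}
```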
  





 ats_memalign: couldn't allocate -548249600 bytes in Vol::init()
 ---

 Key: TS-1528
 URL: https://issues.apache.org/jira/browse/TS-1528
 Project: Traffic Server
  Issue Type: Bug
Affects Versions: 3.2.0
 Environment: Debian testing (wheezy) on i686
Reporter: Jack Bates

 I consistently get the following error whenever I try to start Traffic Server 
 (release 3.2.0). Yesterday I built Traffic Server from Git HEAD (34a2ba) to 
 check if it behaves any differently, but I consistently reproduce this same 
 error whenever I try to start it, too
 Here's my configuration, which is pretty minimal: 
 http://nottheoilrig.com/trafficserver/201210120/
 What details can I provide to help debug this? James Peach suggested 
 attaching some kind of dump of the volume header: 
 http://mail-archives.apache.org/mod_mbox/trafficserver-users/201210.mbox/%3C9ED91AE2-2F52-4BDB-9088-E14D40642C34%40apache.org%3E
 {code}
 administrator@debian$ TS_ROOT=/home/administrator/trafficserver 
 trafficserver/traffic_server
 [TrafficServer] using root directory '/home/administrator/trafficserver'
 FATAL: ats_memalign: couldn't allocate -548249600 bytes at alignment 4096 - 
 insufficient memory
 trafficserver/traffic_server - STACK TRACE:
 trafficserver/libtsutil.so.3(+0x1075b)[0xb76d075b]
 trafficserver/libtsutil.so.3(ats_memalign+0xa1)[0xb76d34c1]
 trafficserver/traffic_server(_ZN3Vol4initEPcxxb+0x282)[0x827bd52]
 trafficserver/traffic_server(_ZN5Cache4openEbb+0x5d8)[0x827dc48]
 trafficserver/traffic_server(_ZN14CacheProcessor15diskInitializedEv+0x323)[0x827e0d3]
 trafficserver/traffic_server(_ZN9CacheDisk9openStartEiPv+0x483)[0x828c9c3]
 trafficserver/traffic_server(_ZN19AIOCallbackInternal11io_completeEiPv+0x25)[0x8280a75]
 trafficserver/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8b)[0x830343b]
 trafficserver/traffic_server(_ZN7EThread7executeEv+0x723)[0x8304003]
 trafficserver/traffic_server(main+0x178d)[0x80c572d]
 /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xb7039e46]
 trafficserver/traffic_server[0x80cabdd]
 administrator@debian:~$ 
 {code}



TS server_session bind to client_session

2012-09-05 Thread taorui
I made a mistake last time: the server_session is bound to the
client_session only when we do not set the limits
proxy.config.http.server_max_connections
and
proxy.config.http.origin_max_connections.





Re: [jira] [Commented] (TS-621) writing 0 bytes to the HTTP cache means only update the header... need a new API: update_header_only() to allow 0 byte files to be cached

2012-04-19 Thread taorui
John: the patch was just a temporary solution, and I did not take into
account the situation you mentioned (I did not even know about it). If you
have any ideas about it, please tell me.

On Thu, 2012-04-19 at 03:51 +, John Plevyak (Commented) (JIRA)
wrote:
 [ 
 https://issues.apache.org/jira/browse/TS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257197#comment-13257197
  ] 
 
 John Plevyak commented on TS-621:
 -
 
 weijin, your patch relies on the Content-Length: header.   That is probably 
 safe and covers most situations, but I don't know that it will work in all 
 situations.  Old school servers (1.0) which don't use Content-Length: will 
 not benefit.  I have to admit that it does finesse the compatibility issues, 
 but I was hoping for a complete solution.  Perhaps we can look at using 
 some combination of the patches whereby we use the API changes I am proposing 
 and verify that we are doing the right thing with the Content-Length and 
 perhaps use your flag?
 
 I'll look into it as well. 
 
  writing 0 bytes to the HTTP cache means only update the header... need a 
  new API: update_header_only() to allow 0 byte files to be cached
  -
 
  Key: TS-621
  URL: https://issues.apache.org/jira/browse/TS-621
  Project: Traffic Server
   Issue Type: Improvement
   Components: Cache
 Affects Versions: 2.1.5
 Reporter: John Plevyak
 Assignee: weijin
  Fix For: 3.1.4
 
  Attachments: TS-621_cluster_zero_size_objects.patch, 
  force_empty.diff, ts-621-jp-1.patch, ts-621-jp-2.patch, ts-621-jp-3.patch
 
 
 
 
 
 





Re: [jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui
excellent, thanks again. 

On Wed, 2012-03-21 at 14:59 +, John Plevyak (Commented) (JIRA)
wrote:
 [ 
 https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234396#comment-13234396
  ] 
 
 John Plevyak commented on TS-1158:
 --
 
 Note that when replacing a mutex, both the new and old mutexes must be held.  
  Also note that this protection (double checking) is only provided in the 
 NetProcessor as it is the only Processor whose VC mutexes are switched.  Any 
 virtualization would need to provide the same protection.
 
  Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
  
 
  Key: TS-1158
  URL: https://issues.apache.org/jira/browse/TS-1158
  Project: Traffic Server
   Issue Type: Bug
   Components: Core
 Affects Versions: 3.0.3
  Environment: ALL
 Reporter: John Plevyak
 Assignee: John Plevyak
  Fix For: 3.1.4
 
  Attachments: ts-1158-jp1.patch
 
 
  Because of the way session management works, the vio.mutex must be 
  re-verified to be identical to the one the lock was taken on after the lock 
  is acquired.  Otherwise there is a race when the mutex is switched, such 
  that the old lock is held while the new lock is not.
 
 
 





[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234459#comment-13234459
 ] 

taorui commented on TS-1158:


excellent, thanks again. 

On Wed, 2012-03-21 at 14:59 +, John Plevyak (Commented) (JIRA)





 Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
 

 Key: TS-1158
 URL: https://issues.apache.org/jira/browse/TS-1158
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 3.0.3
 Environment: ALL
Reporter: John Plevyak
Assignee: John Plevyak
 Fix For: 3.1.4

 Attachments: ts-1158-jp1.patch


 Because of the way session management works, the vio.mutex must be 
 re-verified to be identical to the one the lock was taken on after the lock 
 is acquired.  Otherwise there is a race when the mutex is switched, such 
 that the old lock is held while the new lock is not.





Re: [jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui
I am afraid this race is (or may be one of) the root causes of TS-857, but I
am not sure.

On Wed, 2012-03-21 at 14:53 +, John Plevyak (Commented) (JIRA)
wrote:
 [ 
 https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234393#comment-13234393
  ] 
 
 John Plevyak commented on TS-1158:
 --
 
 The mutex switch occurs in the HttpSessionManager.  When a session is passed 
 to it, the read.vio.mutex and write.vio.mutex from the old controlling HttpSM 
 are replaced with that of a hash bucket of sessions in the Manager (a hash to 
 reduce contention on this globally shared data structure).  When a session is 
 requested from the HttpSessionManager, they are replaced with those of the 
 new HttpSM which will be using that OS connection.  During the swap, the 
 previous and new mutexes are held, but nevertheless, a race is possible if a 
 thread grabs the old (pre substitution) mutex, then a context switch occurs 
 and the mutexes are swapped and the old mutex (pre substitute) lock is 
 released, then the first thread resumes, locks the (pre substitution) mutex 
 and now two threads are running while thinking they are holding the mutex for 
 the NetVC.  The solution is to ensure, after the lock has been taken, that 
 the mutex we have locked is the same one that is protecting the NetVC.  If it 
 is not, we back out and retry later.
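The double-check described above can be sketched like this. Names are illustrative (this is not the attached ts-1158-jp1.patch): snapshot the mutex the VC currently points at, lock it, then re-verify the VC still points at that same mutex; if the session manager swapped mutexes in between, back out and retry later.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Toy VC whose protecting mutex can be swapped by a session manager.
struct VC { std::shared_ptr<std::mutex> mutex; };

// Returns true if we safely held the VC's current mutex, false if the mutex
// was switched underneath us and we must back out and reschedule.
bool try_handle_event(VC &vc) {
  std::shared_ptr<std::mutex> m = vc.mutex;   // snapshot before locking
  std::lock_guard<std::mutex> guard(*m);      // take the snapshotted mutex
  if (m != vc.mutex)                          // swapped while we waited:
    return false;                             //   race lost, retry later
  // Safe: we hold the mutex that currently protects the VC.
  return true;
}
```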
 
  Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
  
 
  Key: TS-1158
  URL: https://issues.apache.org/jira/browse/TS-1158
  Project: Traffic Server
   Issue Type: Bug
   Components: Core
 Affects Versions: 3.0.3
  Environment: ALL
 Reporter: John Plevyak
 Assignee: John Plevyak
  Fix For: 3.1.4
 
  Attachments: ts-1158-jp1.patch
 
 
  Because of the way session management works, the vio.mutex must be 
  re-verified to be identical to the one the lock was taken on after the lock 
  is acquired.  Otherwise there is a race when the mutex is switched, such 
  that the old lock is held while the new lock is not.
 
 






[jira] [Commented] (TS-1158) Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent

2012-03-21 Thread taorui (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235328#comment-13235328
 ] 

taorui commented on TS-1158:


I am afraid this race is (or may be one of) the root causes of TS-857, but I
am not sure.

On Wed, 2012-03-21 at 14:53 +, John Plevyak (Commented) (JIRA)





 Race on mutex switching for NetVConnections in UnixNetVConnection::mainEvent
 

 Key: TS-1158
 URL: https://issues.apache.org/jira/browse/TS-1158
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 3.0.3
 Environment: ALL
Reporter: John Plevyak
Assignee: John Plevyak
 Fix For: 3.1.4

 Attachments: ts-1158-jp1.patch


 Because of the way session management works, the vio.mutex must be 
 re-verified to be identical to the one the lock was taken on after the lock 
 is acquired.  Otherwise there is a race when the mutex is switched, such 
 that the old lock is held while the new lock is not.





Re: [jira] [Commented] (TS-899) ts crash

2011-10-12 Thread taorui
I'll take care of it.
On Tue, 2011-10-11 at 22:53 +, Leif Hedstrom (Commented) (JIRA)
wrote:
 [ 
 https://issues.apache.org/jira/browse/TS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125457#comment-13125457
  ] 
 
 Leif Hedstrom commented on TS-899:
 --
 
 Any update on this? Is anyone working on this ?
 
  ts crash
  
 
  Key: TS-899
  URL: https://issues.apache.org/jira/browse/TS-899
  Project: Traffic Server
   Issue Type: Sub-task
   Components: HTTP, MIME
 Affects Versions: 3.0.1
  Environment: readhat5.5, ts-3.0.1, X86-64
 Reporter: weijin
 Assignee: weijin
  Fix For: 3.1.1
 
 
  If a request URL is forbidden and then redirected to another URL, TS crashes.
 
 
 





Re:[jira] [Commented] (TS-937) EThread::execute still processing cancelled event

2011-09-06 Thread taorui
Hmm, why does the cancellation check in process_event not cover timeout 
events?





At 2011-09-06 08:07:09,Brian Geffon (JIRA) j...@apache.org wrote:

[ 
 https://issues.apache.org/jira/browse/TS-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097644#comment-13097644
  ] 

Brian Geffon commented on TS-937:
-

Thanks for the response, weijin. Perhaps I'm missing something, but there is 
currently a check for the event being cancelled in ProcessEvent for all events 
that do not have a timeout; if the event has a timeout, there is no such check. 
Why would it be safe to put the cancel check in ProcessEvent in that 
situation?

[http://svn.apache.org/viewvc/trafficserver/traffic/trunk/iocore/eventsystem/UnixEThread.cc?view=markup#l234]
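The check being discussed can be sketched as follows. This is a toy, not UnixEThread.cc: the shape is simply that the dispatcher should skip an event whose Action was cancelled before calling into process_event, for timed events as well as immediate ones.

```cpp
#include <cassert>

// Toy event: a cancelled flag (set via Action::cancel in ATS) and a counter
// recording how many times it was actually processed.
struct Event {
  bool cancelled = false;
  int fired = 0;
};

void process_event(Event &e) { ++e.fired; }  // stands in for the callback

// Dispatcher with the guard TS-937 asks for: never process a cancelled event.
void dispatch(Event &e) {
  if (e.cancelled)
    return;          // the check missing for timed events in UnixEThread.cc:232
  process_event(e);
}
```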



 EThread::execute still processing cancelled event
 -

 Key: TS-937
 URL: https://issues.apache.org/jira/browse/TS-937
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 3.0.1, 2.1.9
 Environment: RHEL6
Reporter: Brian Geffon
 Fix For: 3.1.1

 Attachments: UnixEThread.patch


 The included GDB log will show that ATS is trying to process an event that 
 has already been canceled, examining the code of UnixEThread.cc line 232 
 shows that EThread::process_event gets called without a check for the event 
 being cancelled. 
 Brian
 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 0x764fa700 (LWP 28518)]
 0x006fc663 in EThread::process_event (this=0x768ff010, 
 e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
 130  MUTEX_TRY_LOCK_FOR(lock, e->mutex.m_ptr, this, e->continuation);
 Missing separate debuginfos, use: debuginfo-install 
 expat-2.0.1-9.1.el6.x86_64 glibc-2.12-1.25.el6_1.3.x86_64 
 keyutils-libs-1.4-1.el6.x86_64 krb5-libs-1.9-9.el6_1.1.x86_64 
 libcom_err-1.41.12-7.el6.x86_64 libgcc-4.4.5-6.el6.x86_64 
 libselinux-2.0.94-5.el6.x86_64 libstdc++-4.4.5-6.el6.x86_64 
 openssl-1.0.0-10.el6_1.4.x86_64 pcre-7.8-3.1.el6.x86_64 
 tcl-8.5.7-6.el6.x86_64 zlib-1.2.3-25.el6.x86_64
 (gdb) bt
 #0  0x006fc663 in EThread::process_event (this=0x768ff010, 
 e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
 #1  0x006fcbaf in EThread::execute (this=0x768ff010) at 
 UnixEThread.cc:232
 #2  0x006fb844 in spawn_thread_internal (a=0xfb7e80) at Thread.cc:88
 #3  0x0036204077e1 in start_thread () from /lib64/libpthread.so.0
 #4  0x00361f8e577d in clone () from /lib64/libc.so.6
 (gdb) bt full
 #0  0x006fc663 in EThread::process_event (this=0x768ff010, 
 e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
 lock = {m = {m_ptr = 0x764f9d20}, lock_acquired = 202}
 #1  0x006fcbaf in EThread::execute (this=0x768ff010) at 
 UnixEThread.cc:232
 done_one = false
 e = 0x1db45c0
 NegativeQueue = {DLL<Event, Event::Link_link> = {head = 0xfc75f0}, 
 tail = 0xfc75f0}
 next_time = 1314647904419648000
 #2  0x006fb844 in spawn_thread_internal (a=0xfb7e80) at Thread.cc:88
 p = 0xfb7e80
 #3  0x0036204077e1 in start_thread () from /lib64/libpthread.so.0
 No symbol table info available.
 #4  0x00361f8e577d in clone () from /lib64/libc.so.6
 No symbol table info available.
 (gdb) f 0
 #0  0x006fc663 in EThread::process_event (this=0x768ff010, 
 e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
 130  MUTEX_TRY_LOCK_FOR(lock, e->mutex.m_ptr, this, e->continuation);
 (gdb) p *e
 $2 = {Action = {_vptr.Action = 0x775170, continuation = 0x1f2fc08, mutex = 
 {m_ptr = 0x7fffd40fba40}, cancelled = 1}, ethread = 0x768ff010, 
 in_the_prot_queue = 0, in_the_priority_queue = 0, 
   immediate = 1, globally_allocated = 1, in_heap = 0, callback_event = 1, 
 timeout_at = 0, period = 0, cookie = 0x0, link = {SLink<Event> = {next = 
 0x0}, prev = 0x0}}





Re:[jira] [Created] (TS-896) log collation reporting Host down

2011-07-30 Thread taorui
I think the bug may be not in the logging system but in HostDB. If it gets a 
wrong address through HostDBProcessor::getby, it will try to connect 
every 5 seconds.
At 2011-07-30 00:06:10,Zhao Yongming (JIRA) j...@apache.org wrote:
log collation reporting Host down
-

 Key: TS-896
 URL: https://issues.apache.org/jira/browse/TS-896
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.0.1
Reporter: Zhao Yongming


In my production, the server reporting something very strange regard to 
logging collation:
{code}

[Jul 29 18:58:27.665] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:27.665] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:29.793] Server {47330630715728} NOTE: [log-coll] host down 
[133.0.0.0:0]
[Jul 29 18:58:30.491] Server {1106377024} NOTE: [log-coll] host down 
[50.52.120.55:1953855031]
[Jul 29 18:58:31.419] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:31.681] Server {1099974976} NOTE: [log-coll] host down 
[0.0.0.0:1]
[Jul 29 18:58:31.739] Server {1104271680} NOTE: [log-coll] host down 
[0.0.0.0:-139068]
[Jul 29 18:58:31.811] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:31.872] Server {1106377024} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:31.899] Server {1107429696} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:32.666] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:32.666] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:36.419] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:36.679] Server {1107429696} NOTE: [log-coll] host down 
[0.0.0.0:1]
[Jul 29 18:58:36.738] Server {1099974976} NOTE: [log-coll] host down 
[0.0.0.0:-139068]
[Jul 29 18:58:36.810] Server {1104271680} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:36.870] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:36.899] Server {1107429696} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:37.668] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:37.668] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:37.668] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:39.569] Server {1106377024} NOTE: [log-coll] host down 
[115.113.108.44:745436017]
[Jul 29 18:58:39.585] Server {47330630715728} NOTE: [log-coll] host down 
[208.151.89.0:0]
[Jul 29 18:58:39.588] Server {1107429696} NOTE: [log-coll] host down 
[60.99.114.99:1835561315]
[Jul 29 18:58:39.872] Server {1105324352} NOTE: [log-coll] host down 
[196.0.58.16:100667653]
[Jul 29 18:58:41.420] Server {1099974976} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:41.679] Server {1106377024} NOTE: [log-coll] host down 
[0.0.0.0:1]
[Jul 29 18:58:41.740] Server {47330630715728} NOTE: [log-coll] host down 
[0.0.0.0:-139068]
[Jul 29 18:58:41.814] Server {1099974976} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:41.874] Server {1104271680} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:41.904] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:42.669] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:42.669] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:44.862] Server {1106377024} NOTE: [log-coll] host down 
[128.81.1.0:0]
[Jul 29 18:58:46.420] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:46.678] Server {47330630715728} NOTE: [log-coll] host down 
[0.0.0.0:1]
[Jul 29 18:58:46.740] Server {1099974976} NOTE: [log-coll] host down 
[0.0.0.0:-139068]
[Jul 29 18:58:46.818] Server {1104271680} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:46.874] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:46.899] Server {1106377024} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:47.670] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:47.670] Manager {140453854168864} NOTE: [Alarms::signalAlarm] 
Skipping Alarm: 'Collation host 0.0.0.0:0 down'
[Jul 29 18:58:48.765] Server {1107429696} NOTE: [log-coll] host down 
[73.26.0.0:-1]
[Jul 29 18:58:51.421] Server {1107429696} NOTE: [log-coll] host down 
[0.0.0.0:0]
[Jul 29 18:58:51.678] Server {1105324352} NOTE: [log-coll] host down 
[0.0.0.0:1]
[Jul 29 18:58:51.739] Server {1106377024} NOTE: