Hi guys, just bumping up this issue.
2014-03-18 15:01 GMT-03:00 Alena Prokharchyk <alena.prokharc...@citrix.com>: > Ok, backend reports it correctly. So here is the sequence for the > problem: > > > 1. Backend - VR - report the networking data to the CS MS > 2. Networking data never gets inserted to the cloud_usage DB. > > > Kishan, can you please help to figure out why Usage server doesn't > insert this data to cloud_usage DB? > > Thanks, > Alena. > > From: Diego Spinola Castro <spinolacas...@gmail.com> > Date: Tuesday, March 18, 2014 at 10:29 AM > > To: Alena Prokharchyk <alena.prokharc...@citrix.com> > Cc: "users@cloudstack.apache.org" <users@cloudstack.apache.org>, Kishan > Kavala <kishan.kav...@citrix.com> > Subject: Re: NetworkUsage > > Take a look at: http://www.fpaste.org/86434/51637081/ > > > 2014-03-18 14:15 GMT-03:00 Alena Prokharchyk <alena.prokharc...@citrix.com > >: > >> Diego, >> >> Check for these messages in the CS log: >> >> >> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-3-VM","bytesSen >> t":0,"bytesReceived":0,"result":true,"details":"success","wait":0}}] } >> >> >> And see if you are seeing non-0 bytesReceived/bytesSent reported from >> the backend for any day after 2013-11-08 >> >> -Alena. >> >> From: Diego Spinola Castro <spinolacas...@gmail.com> >> Date: Tuesday, March 18, 2014 at 10:12 AM >> >> To: Alena Prokharchyk <alena.prokharc...@citrix.com> >> Cc: "users@cloudstack.apache.org" <users@cloudstack.apache.org>, Kishan >> Kavala <kishan.kav...@citrix.com> >> Subject: Re: NetworkUsage >> >> Hi Alena, there's no records after 2013-11-08 (when i did the upgrade >> to CS 4.2). >> >> >> If there are no records for Feb2 in the DB, can you check if CS >> reported any usage during this day? How do i do that? >> >> >> 2014-03-18 13:45 GMT-03:00 Alena Prokharchyk < >> alena.prokharc...@citrix.com>: >> >>> Ok, so having 0 for some router records is actually fine as either the >>> Vrs are stopped, or there are no vms. Good you've confirmed that's not the >>> problem. >>> >>> Now we have to debug something different - API not returning certain >>> types of usage. For that, you have to: >>> >>> >>> 1. go to cloud_usage DB >>> 2. Select * from cloudusage where usage_type in (4,5) >>> >>> And see if there are any records for the Feb 2 are present. If there >>> are, then its an API problem - when data is present in the DB, but API >>> doesn't return it. >>> >>> If there are no records for Feb2 in the DB, can you check if CS >>> reported any usage during this day? >>> >>> Thanks, >>> Alena. >>> >>> From: Diego Spinola Castro <spinolacas...@gmail.com> >>> Date: Tuesday, March 18, 2014 at 8:46 AM >>> To: Alena Prokharchyk <alena.prokharc...@citrix.com> >>> Cc: "users@cloudstack.apache.org" <users@cloudstack.apache.org>, Kishan >>> Kavala <kishan.kav...@citrix.com> >>> Subject: Re: NetworkUsage >>> >>> HI Alena, thanks for the answer. Let me explain better my case. >>> Regarding those accounts , i have a lot of them without any instance >>> running, there's also some vrouters of that list which are even stopped. >>> I'm trying get the traffic count from usage api, i don't know why but >>> all usage type 4 is zero, there's the commands i'm running: >>> >>> Trying get UsageType=4 >>> curl " >>> http://localhost:8096/client/api?command=listUsageRecords&startdate=2014-02-01&enddate=2014-02-01&type=4 >>> " >>> Neither Type 4 nor 5 are returning. >>> >>> This used to work before upgrade to 4.2. >>> BTW i have only one management server. >>> >>> >>> >>> >>> 2014-03-17 20:54 GMT-03:00 Alena Prokharchyk < >>> alena.prokharc...@citrix.com>: >>> >>>> Diego, spent some more time on debugging your issue. Might not be >>>> related >>>> to transaction refactoring as it went in after 4.2. And records for some >>>> accounts actually got non-0 network stat: >>>> >>>> 0 network stats: >>>> >>>> mysql> select count(*) from user_statistics where net_bytes_received=0 >>>> or >>>> net_bytes_sent=0 order by account_id; >>>> +----------+ >>>> | count(*) | >>>> +----------+ >>>> | 26 | >>>> +----------+ >>>> 1 row in set (0.00 sec) >>>> >>>> Non-0 network stats >>>> >>>> mysql> select count(*) from user_statistics where net_bytes_received!=0 >>>> or >>>> net_bytes_sent!=0 order by account_id; >>>> +----------+ >>>> | count(*) | >>>> +----------+ >>>> | 207 | >>>> +----------+ >>>> 1 row in set (0.00 sec) >>>> >>>> >>>> >>>> Adding Kishan to the thread as he has more expertise in Network usage >>>> calculation and he might be able to help us. >>>> >>>> >>>> I would divide problematic accounts (Networks) that are not getting >>>> usage >>>> stats update, into 2 categories: >>>> >>>> CATEGORY #1 Networks having non-0 current_bytes* and 0 net_bytes >>>> received >>>> >>>> This situation is totally normal as net_bytes_received get updated only >>>> when VR gets stopped. Yet on Diego's setup, those entries are not >>>> counted >>>> when do network usage calculation. Here is the account having problems: >>>> >>>> mysql> select * from user_statistics where account_id=176; >>>> >>>> +-----+----------------+------------+-------------------+-----------+------ >>>> >>>> --------+------------+--------------------+----------------+--------------- >>>> ---------+--------------------+--------------------+----------------+ >>>> | id | data_center_id | account_id | public_ip_address | device_id | >>>> device_type | network_id | net_bytes_received | net_bytes_sent | >>>> current_bytes_received | current_bytes_sent | agg_bytes_received | >>>> agg_bytes_sent | >>>> >>>> +-----+----------------+------------+-------------------+-----------+------ >>>> >>>> --------+------------+--------------------+----------------+--------------- >>>> ---------+--------------------+--------------------+----------------+ >>>> | 228 | 1 | 176 | NULL | 1193 | >>>> DomainRouter | 366 | 0 | 0 | >>>> 4271398427 | 2014478889 | 0 | >>>> 0 | >>>> >>>> +-----+----------------+------------+-------------------+-----------+------ >>>> >>>> --------+------------+--------------------+----------------+--------------- >>>> ---------+--------------------+--------------------+----------------+ >>>> 1 row in set (0.00 sec) >>>> >>>> >>>> >>>> And here is the Usage log: >>>> >>>> >>>> 1. 2014-03-17 03:15:02,779 DEBUG [usage.parser.NetworkUsageParser] >>>> (Usage-Job-1:null) Parsing all Network usage events for account: 176 >>>> >>>> 2. 2014-03-17 03:15:02,779 DEBUG [usage.parser.NetworkUsageParser] >>>> (Usage-Job-1:null) No usage record (0 bytes used) generated for account: >>>> 176 >>>> >>>> >>>> >>>> >>>> Kishan, is it by design not to count current_bytes in network usage? Can >>>> you please double check the code, because it doesn't look right to me as >>>> the VR may not go into Stopped state for a while, yet we have to >>>> calculate >>>> the usage >>>> >>>> >>>> CATEGORY #2 >>>> >>>> Accounts (networks) having all 0s for >>>> >>>> net_bytes_received/net_bytes_sent/current_bytes_received/current_bytes_sent >>>> >>>> Diego, for those accounts we have to figure out whether the backend >>>> reporting any usage for the Vrs. To debug that, you have to: >>>> >>>> A) get all the VR names >>>> >>>> mysql> select name from vm_instance where removed is null and >>>> vm_type='DomainRouter' and account_id in (select account_id from >>>> user_statistics where net_bytes_received = 0 and net_bytes_sent=0 and >>>> current_bytes_received=0 and curre >>>> +-----------+ >>>> | name | >>>> +-----------+ >>>> | r-621-VM | >>>> | r-651-VM | >>>> | r-658-VM | >>>> | r-671-VM | >>>> | r-682-VM | >>>> | r-886-VM | >>>> | r-892-VM | >>>> | r-915-VM | >>>> | r-934-VM | >>>> | r-935-VM | >>>> | r-986-VM | >>>> | r-1002-VM | >>>> | r-1004-VM | >>>> | r-1030-VM | >>>> | r-1148-VM | >>>> | r-1205-VM | >>>> | r-1221-VM | >>>> +-----------+ >>>> >>>> >>>> B) Search for NetworkUsageAnswer and routerName in the log to locate >>>> statements like that. Do it on your today's log and the log(s) from >>>> couple >>>> of days before. You might also want to run some traffic >>>> >>>> >>>> 2014-03-17 14:41:12,661 DEBUG [agent.transport.Request] >>>> (DirectAgent-15:null) Seq 1-281411626: Processing: { Ans: , MgmtId: >>>> 4278190080, via: 1, Ver: v1, Flags: 10, >>>> >>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-3-VM","bytesSen >>>> t":0,"bytesReceived":0,"result":true,"details":"success","wait":0}}] } >>>> >>>> >>>> If you see that the backend reports non-0 value, it means that the >>>> Management server fails to save it. If the backend reports 0, and yet >>>> you >>>> are sure that there is a network traffic generated on the routers above, >>>> there might be some error in a way we collect the traffic on the VR. >>>> >>>> >>>> Also how many management servers do you have in your setup? >>>> >>>> >>>> -Alena. >>>> >>>> >>>> >>>> On 3/17/14, 12:05 PM, "Alena Prokharchyk" <alena.prokharc...@citrix.com >>>> > >>>> wrote: >>>> >>>> >Diego, please go ahead and file a bug with Priority Critical against >>>> CS. >>>> >Looks like there is some bug in a way to commit db statement in >>>> >transaction resulting in the DB not being updated: >>>> > >>>> >2014-03-17 13:14:15,431 TRACE [db.Transaction.Transaction] >>>> >(RouterMonitor-1:null) txn: Not committing because transaction started >>>> >elsewhere: >>>> >>>> >-GenericDaoBase.update:814-ComponentInstantiationPostProcessor$Interceptor >>>> >>>> >Dispatcher.intercept:125-GenericDaoBase.update:774-ComponentInstantiationP >>>> >>>> >ostProcessor$InterceptorDispatcher.intercept:125-GenericDaoBase.update:126 >>>> >>>> >2-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:125- >>>> >>>> >VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:948-Executors$Runn >>>> >>>> >ableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAn >>>> >>>> >dReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165- >>>> >ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267 / >>>> >>>> >-VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:900-Executors$Run >>>> >>>> >nableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runA >>>> >ndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165 >>>> >>>> >-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecuto >>>> >r.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:701 : >>>> >update, update, update, >>>> >>>> >-VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:900-Executors$Run >>>> >>>> >nableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runA >>>> >ndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165 >>>> >>>> >-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecuto >>>> >r.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:701, >>>> > >>>> > >>>> > Attach the latest log from this email to the bug. >>>> > >>>> >Unfortunately, there is no way to workaround it on your side as its a >>>> >code issue. >>>> > >>>> >-Alena. >>>> > >>>> >From: Diego Spinola Castro >>>> ><spinolacas...@gmail.com<mailto:spinolacas...@gmail.com>> >>>> >Date: Monday, March 17, 2014 at 11:25 AM >>>> >To: Alena Prokharchyk >>>> ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>> >>>> >Cc: "users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>" >>>> ><users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>> >>>> >Subject: Re: NetworkUsage >>>> > >>>> >Hi, sorry. i went lost with this huge log. >>>> > >>>> > >>>> > >>>> >2014-03-17 15:10 GMT-03:00 Alena Prokharchyk >>>> ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>>: >>>> >Diego, the log you've provided, doesn't contain "Rolling back the >>>> >transaction" statement. You should wait for this particular error to >>>> >happen (while logging on TRACE), and then take a log snippet. >>>> > >>>> >From: Diego Spinola Castro >>>> ><spinolacas...@gmail.com<mailto:spinolacas...@gmail.com>> >>>> >Date: Monday, March 17, 2014 at 11:00 AM >>>> >To: Alena Prokharchyk >>>> ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>> >>>> >Cc: "users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>" >>>> ><users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>> >>>> >Subject: Re: NetworkUsage >>>> > >>>> >Hi, Alena. thanks for the tips, i got some logs: >>>> >http://www.fpaste.org/86141/13950791/ >>>> > >>>> > >>>> >2014-03-17 14:27 GMT-03:00 Alena Prokharchyk >>>> ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>>: >>>> >Diego, >>>> > >>>> >Unfortunately CS lock lack the needed info for debugging the problem. I >>>> >see only transaction roll back statement: >>>> > >>>> >2014-03-17 06:01:22,363 DEBUG [db.Transaction.Transaction] >>>> >(RouterMonitor-1:null) Rolling back the transaction: Time = 2 Name = >>>> >>>> >-VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:900-Executors$Run >>>> >n >>>> >>>> >ableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAn >>>> >d >>>> >>>> >Reset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165- >>>> > >>>> > >>>> >To debug it further, you need to change the log level from DEBUG to >>>> TRACE >>>> >for a certain class by modifying log4 properties (no MS restart is >>>> >required) >>>> > >>>> > >>>> >Here is the link on how to do it: >>>> > >>>> > >>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Useful+Tips+and+Tri >>>> >c >>>> >ks, "How to enable trace logging for a particular class in CloudStack?² >>>> >section. Replace category name with: >>>> > >>>> ><category name=³com.cloud.utils.db"> >>>> > <priority value=³TRACE"/> >>>> > </category> >>>> > >>>> > >>>> > >>>> > >>>> > >>>> >After the change is done, wait till you see TRACE statements in the >>>> log, >>>> >and till the error "Rolling back the transaction" occurs again. Take >>>> the >>>> >log snippet and send it out. After you are done with that, you will >>>> need >>>> >to remove TRACE config from log4 properties file, otherwise your logs >>>> will >>>> >grow huge. >>>> > >>>> >-Alena. >>>> > >>>> > >>>> >On 3/17/14, 9:48 AM, "Diego Spinola Castro" >>>> ><spinolacas...@gmail.com<mailto:spinolacas...@gmail.com>> >>>> >wrote: >>>> > >>>> >>Hi guys, i've been running into a issue when getting network traffic >>>> from >>>> >>usage server. All records are Zero, i have a CS 4.2.0 install. >>>> >> >>>> >>Here the usage.logs http://www.fpaste.org/86116/13950748/ >>>> >>And management-server.log http://www.fpaste.org/86118/07486613/ >>>> >> >>>> >>Does anyone know what is wrong ? >>>> > >>>> > >>>> > >>>> >>>> >>> >> >