Hello LightDot

Thanks for the interest in my post; I hope we can come up with a way to 
improve performance. This is my current setup:


   - Host: Digital Ocean (and yes, I do think their Droplets, as instances 
   are called, are KVM)
   - OS: Ubuntu 13.04
   - Web Server: Apache 2.2
   - Database: PostgreSQL 9.1
   - Memory usage at peak times is about 1.4 GB (out of 2 GB)
   - By the way, I have no swap partition
   - I don't know how to count db connections; in my db definition I used 
   pool_size=50, but I don't know how to check the number of connections at 
   any given time (see the query sketch right below this list)
   - Disk usage, according to Digital Ocean's metrics, is medium and never 
   high (though honestly I have never understood their graph; see the 
   /proc/stat sketch below)
   - CPU usage at some points gets close to 50% (since this is a dual core, 
   I would assume one of the cores is at 100%)

I don't know what else to mention, but if I missed anything please ask. And 
thanks again.
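
Since I don't fully trust my reading of the Digital Ocean disk graph, I was 
thinking of measuring iowait directly on the droplet with something quick 
like this (plain Python reading /proc/stat, nothing web2py specific):

    # rough iowait check (Linux only); the 'cpu' line in /proc/stat is:
    # cpu  user nice system idle iowait irq softirq ...
    import time

    def cpu_times():
        with open('/proc/stat') as f:
            return [float(x) for x in f.readline().split()[1:]]

    before = cpu_times()
    time.sleep(5)                          # sample over 5 seconds
    after = cpu_times()
    deltas = [a - b for a, b in zip(after, before)]
    print('iowait over the sample: %.1f%%' % (100.0 * deltas[4] / sum(deltas)))

If iowait stays near zero while one core sits at 100%, then the disk 
probably isn't the real bottleneck.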


On Friday, April 4, 2014 10:50:07 AM UTC-6, LightDot wrote:
>
> It would be prudent to see some numbers and to learn about your setup 
> more...
>
> So, what is your current setup like? Which web server are you using, which 
> database? How many simultaneous db connections are we talking about?
>
> Digital Ocean is using KVM virtualization, correct? How is your memory 
> usage, and which OS are you using within your VPS? How bad is the i/o 
> really..?
>
>
> On Friday, April 4, 2014 4:30:27 PM UTC+2, Francisco Betancourt wrote:
>>
>> Thanks for the response Niphlod. I believe the bottleneck is the db. I am 
>> rewriting the code to use fewer queries, but in general terms I think I 
>> agree with you, I need a faster server. I currently host on Digital Ocean 
>> and I'm running a dual core server with 2GB of RAM. The statistics for 
>> the server show CPU usage never reaches 100% (not even 50%), even when 
>> processing the big files, so I guess the problem is I/O. This server 
>> already uses SSDs, so I was thinking of using a second db server to 
>> divide the work, so that the information I get back from the web service 
>> is stored in another db, and then I could also use this db for my 
>> scheduler tasks. 
>>
>> Do you think this makes sense?
>>
>> On Thursday, April 3, 2014 10:17:09 AM UTC-6, Niphlod wrote:
>>>
>>> I think the main point is that you need to assess whether you have enough 
>>> resources to do what you want. If your server can't keep up with the work 
>>> you're asking it to do, then using the scheduler or whatever won't bring 
>>> any benefits.
>>> If instead you have a not-so-snappy user interface because you don't 
>>> have enough power to respond quickly to a user, then the scheduler is the 
>>> right fit.
>>>
>>> If you're seeing all the "hurdle" of the system on the db backend, then 
>>> you're in the same situation as the first one, i.e. you need to "buy more 
>>> juice".
>>>
>>> On Thursday, April 3, 2014 1:33:59 AM UTC+2, Francisco Betancourt wrote:
>>>>
>>>> Hello everyone.
>>>>
>>>> I need some help defining the architecture for an application. We 
>>>> already have something in place, but the size of the users' data files 
>>>> is making the site very unresponsive. The app works as follows:
>>>>
>>>>
>>>>    1. The user uploads a csv file
>>>>    2. The user file is read to a table for further processing
>>>>    3. The user previews the data I read from his file (to see if 
>>>>    everything is ok)
>>>>    4. If the data is ok, he clicks a button which will make us generate 
>>>>    an xml (using some of the data he uploaded) and send the xml to a web 
>>>>    service
>>>>    5. The web service is not ours, and we must send the xml files one 
>>>>    at a time (though we can send hundreds simultaneously)
>>>>    6. The web service returns data, and we store that data into the db
>>>>    7. Once done we offer a print friendly version of the final data
>>>>
>>>> So currently we are doing the following:
>>>>
>>>>    1. Receive a file, and once it is saved we process it with web2py's 
>>>>    built-in import from csv.
>>>>    2. Once data has been read we show a view with all the rows and a 
>>>>    button to start the process
>>>>    3. Using js we send groups of 20 rows at a time through ajax, to be 
>>>>    processed (to xml) and sent to the web service
>>>>    4. Each ajax call returns js code to update a progress bar
>>>>
>>>> Originally these files were supposed to consist of hundreds of rows, 
>>>> hardly ever a thousand, but in the end we have files with 15000 rows 
>>>> and the average is about 4000. Incredibly, the view (even with 15000 
>>>> rows and js) works well. But the site becomes quite unresponsive 
>>>> (especially because there are normally a dozen or so users doing the 
>>>> same thing simultaneously).
>>>>
>>>> We have already done as many optimizations as we know of (use 
>>>> migrate=False for the db, call session.forget in ajax controller 
>>>> functions, byte-compile the code) but it is still way too slow. So we 
>>>> are making some heavy changes. I want to try to use the scheduler and 
>>>> workers to do this job, but I fear it will generate even more db 
>>>> queries and make things worse. So I would like some suggestions on how 
>>>> to tackle this problem.
>>>>
>>>> Is the scheduler the way to go? And if so, should I make a task for 
>>>> every xml and web service call (this seems like a bad idea to me), or 
>>>> should I group them in chunks of an arbitrary size (e.g. 50)? And if I 
>>>> make this change, will I be able to display the progress of the process?
>>>>
>>>> Thanks for all the help. Good day.
>>>>
>>>
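
For reference, this is roughly what I have in mind for the scheduler 
approach from my original question: queue the rows in chunks instead of one 
task per xml. This is only a sketch; upload_row, build_xml and 
call_webservice (and the status/ws_result fields) are made-up names standing 
in for my real tables and code:

    # in a model file (e.g. models/scheduler.py, the file name doesn't matter)
    from gluon.scheduler import Scheduler

    def process_chunk(row_ids):
        # row_ids is a list of ids from the (hypothetical) upload_row table
        for rid in row_ids:
            row = db.upload_row(rid)
            xml = build_xml(row)            # stands in for my real xml code
            result = call_webservice(xml)   # one xml per call, as required
            row.update_record(status='done', ws_result=result)
            db.commit()                     # commit per row so progress shows
        return len(row_ids)

    scheduler = Scheduler(db, heartbeat=3)

    # in a controller, once the user confirms the preview
    def start_processing():
        query = db.upload_row.upload_id == request.args(0)
        ids = [r.id for r in db(query).select(db.upload_row.id)]
        for i in range(0, len(ids), 50):    # chunks of 50, as discussed
            scheduler.queue_task('process_chunk',
                                 pvars=dict(row_ids=ids[i:i + 50]),
                                 timeout=600)
        return dict(total=len(ids))

Workers would run with python web2py.py -K yourapp (app name as a 
placeholder), and the progress bar could just poll, via ajax, how many 
upload_row records already have status == 'done', instead of the current 
20-rows-per-ajax-call loop.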
