I'm not sure I'm out of memory per se. It just feels like I'm
incurring a huge cost going out to the DB row-by-row when the system
could be doing a batch SELECT from the DB and calculating/caching
locally. But really, I'm not sure.
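
A minimal sketch of what "batch load, then calculate locally" could look like, in plain Java with no Mahout or JDBC involved. The class name, the hard-coded rows standing in for a single SELECT, and the choice of cosine similarity are all assumptions made for illustration, not what Mahout does internally.

```java
import java.util.HashMap;
import java.util.Map;

class BatchLoadSketch {

    // Hypothetical stand-in for the result of ONE query such as
    // "SELECT user_id, item_id, preference FROM ...".
    // Each row is {userId, itemId, preference}; the values are made up.
    public static final double[][] ROWS = {
        {1, 10, 5.0}, {1, 11, 3.0},
        {2, 10, 4.0}, {2, 11, 2.0},
        {3, 12, 1.0}
    };

    // Group all rows by user in a single pass, so that similarity
    // computations afterwards never go back to the database.
    public static Map<Long, Map<Long, Double>> loadAll(double[][] rows) {
        Map<Long, Map<Long, Double>> byUser = new HashMap<>();
        for (double[] r : rows) {
            byUser.computeIfAbsent((long) r[0], k -> new HashMap<>())
                  .put((long) r[1], r[2]);
        }
        return byUser;
    }

    // Cosine similarity between two users' preference vectors,
    // computed entirely from the in-memory maps.
    public static double cosine(Map<Long, Double> a, Map<Long, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // no preferences on one side: treat as dissimilar
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        Map<Long, Map<Long, Double>> prefs = loadAll(ROWS);
        System.out.println("sim(1,2) = " + cosine(prefs.get(1L), prefs.get(2L)));
        System.out.println("sim(1,3) = " + cosine(prefs.get(1L), prefs.get(3L)));
    }
}
```

The point of the sketch is only the access pattern: one pass over the rows up front, then every pairwise comparison is a map lookup instead of a round trip to the DB.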

Is a UserSimilarity approach expected to be this slow with the amount
of data I have? Is an item-based approach preferable when speed is
the main concern?

On Tue, Jul 12, 2011 at 11:00 PM, Lance Norskog <[email protected]> wrote:
> MySQL has some quirk about reading in batches. See this in the Solr
> wiki about it:
>
> http://wiki.apache.org/solr/DataImportHandlerFaq?highlight=%28mysql%29#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F
>
> I don't know how to set special properties in the JDBC data source.
>
> On Tue, Jul 12, 2011 at 10:09 PM, Salil Apte <[email protected]> wrote:
>> Oh yea, at runtime, I'm getting back a BasicDataSource object for my
>> DataSource. Is that correct?
>>
>> On Tue, Jul 12, 2011 at 9:59 PM, Salil Apte <[email protected]> wrote:
>>> So I started actually looking at performance today and it is pretty
>>> horrendous. I've got about 61,000 rows in my database which I'm
>>> assuming isn't *that* many rows. But recommendations are taking > 20
>>> seconds. Is there some way to ensure pooling is turned on? What else
>>> is a big driver for performance? My tables are set up so that I have a
>>> multi-column unique index on <user_id, item_id> pairs. That
>>> way, there cannot be two entries with the same <user_id, item_id>. I'm
>>> not sure where to go from here.
>>>
>>> Thanks for the help!
>>>
>>> On Tue, Jul 12, 2011 at 12:47 AM, Sean Owen <[email protected]> wrote:
>>>> You can ignore it. It just doesn't know for sure you have a pool.
>>>> I believe I have even removed this in a recent refactoring.
>>>>
>>>> On Tue, Jul 12, 2011 at 2:21 AM, Salil Apte <[email protected]> wrote:
>>>>
>>>>> So I keep getting this warning from either Mahout or the server (I'm
>>>>> guessing the former):
>>>>>
>>>>> WARNING: You are not using ConnectionPoolDataSource. Make sure your
>>>>> DataSource pools connections to the database itself, or database
>>>>> performance will be severely reduced.
>>>>>
>>>>> I'm not really sure why this is happening. I have the following
>>>>> resource in my webapp's context.xml file. Is there anything else I
>>>>> need to do to enable connection pooling with a JNDI resource?
>>>>>
>>>>> <Resource name="jdbc/offline-local" auth="Container"
>>>>> type="javax.sql.DataSource" username="root" password=""
>>>>> driverClassName="com.mysql.jdbc.Driver"
>>>>> url="jdbc:mysql://localhost:3306/offlinedevel?autoReconnect=true&amp;cachePreparedStatements=true&amp;cachePrepStmts=true&amp;cacheResultSetMetadata=true&amp;alwaysSendSetIsolation=false&amp;elideSetAutoCommits=true"
>>>>> validationQuery="select 1" maxActive="16" maxIdle="4"
>>>>> removeAbandoned="true" logAbandoned="true" />
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> -Salil
>>>>>
>>>>
>>>
>>
>
>
>
> --
> Lance Norskog
> [email protected]
>
