I see something similar all the time. I get around it by closing the current connection and reconnecting after a sleep. I am able to do quite a few inserts between errors, though, so I'm not sure it's exactly the same problem.
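Roughly, the close-sleep-reconnect workaround looks like the Python sketch below. It is only an illustration under assumptions: `ReconnectingClient`, `connect`, `operation`, and `retry_on` are made-up names, not part of the Thrift or Cassandra APIs, and a real Thrift client would raise its own transport exception rather than the `TimeoutError` used as the default here.

```python
import time


class ReconnectingClient:
    """Keep one connection open; on a transient error, close it, sleep,
    reconnect, and retry the failed operation.

    connect  -- hypothetical zero-argument factory returning a connection
                object that has a close() method
    retry_on -- exception types treated as transient (an assumption; pass
                whatever your client library actually raises on timeout)
    """

    def __init__(self, connect, retries=3, delay=1.0,
                 retry_on=(TimeoutError,), sleep=time.sleep):
        self._connect = connect
        self._retries = retries
        self._delay = delay
        self._retry_on = retry_on
        self._sleep = sleep
        self._conn = connect()

    def call(self, operation):
        """Run operation(conn), reconnecting after each transient failure."""
        for attempt in range(self._retries + 1):
            try:
                return operation(self._conn)
            except self._retry_on:
                if attempt == self._retries:
                    raise  # out of retries: let the caller see the error
                self._conn.close()
                self._sleep(self._delay)
                self._conn = self._connect()

    def close(self):
        self._conn.close()
```

Each insert or read then goes through `client.call(lambda conn: ...)`, so a timeout costs one sleep and one reconnect instead of killing the whole run.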

-Anthony

On Thu, Oct 15, 2009 at 11:26:08AM -0400, Eric Lubow wrote:
> Using the Thrift Perl API into Cassandra, I am running into what is
> endearingly referred to as the 4 bytes of doom:
> TSocket: timed out reading 4 bytes from localhost:9160
>
> The script I am using is fairly simple. I have a text file that has about
> 3.6 million lines that are formatted like: [email protected] 1234
>
> The Cassandra dataset is a single column family called Users in the Mailings
> keyspace with a data layout of:
> Users = {
>     '[email protected]': {
>         email: '[email protected]',
>         person_id: '123456',
>         send_dates_2009-09-30: '2245',
>         send_dates_2009-10-01: '2247',
>     },
> }
> There are about 3.5 million rows in the Users column family and each row has
> no more than 4 columns (listed above). Some only have 3 (one of the
> send_dates_YYYY-MM-DD isn't there).
>
> The script parses it and then connects to Cassandra and does a get_slice and
> counts the return values adding that to a hash:
>     my ($value) = $client->get_slice(
>         'Mailings',
>         $email,
>         Cassandra::ColumnParent->new({
>             column_family => 'Users',
>         }),
>         Cassandra::SlicePredicate->new({
>             slice_range => Cassandra::SliceRange->new({
>                 start  => 'send_dates_2009-09-29',
>                 finish => 'send_dates_2009-10-30',
>             }),
>         }),
>         Cassandra::ConsistencyLevel::ONE
>     );
>     $counter{($#{$value} + 1)}++;
>
> For the most part, this script times out after 1 minute or so. Replacing the
> get_slice with a get_count, I can get it to about 2 million queries before I
> get the timeout. Replacing the get_slice with a get, I make it to about 2.5
> million before I get the timeout. The only way I could get it to run all
> the way through was to add a 1/100 of a second sleep during every iteration.
> I was able to get the script to complete when I shut down everything else
> on the machine (and it took 177m to complete). But since this is a
> semi-production machine, I had to turn everything back on afterwards.
>
> So for poops and laughs (at the recommendation of jbellis), I rewrote the
> script in Python and it has since run (using get_slice) 3 times fully
> without timing out (approximately 130m in Python) with everything else
> running on the machine.
>
> My question is, having seen this same thing in the PHP API and it is my
> understanding that the Perl API was based on the PHP API, could
> http://issues.apache.org/jira/browse/THRIFT-347 apply to Perl here too? Is
> anyone else seeing this issue? If so, have you gotten around it?
>
> Thanks.
>
> -e

--
------------------------------------------------------------------------
Anthony Molinaro <[email protected]>
