Re: COPY TO export fails with

2016-05-10 Thread Stefania Alborghetti
For COPY TO you can try increasing the page timeout or decreasing the page
size:

PAGETIMEOUT=10   - the page timeout in seconds for fetching results
PAGESIZE='1000'  - the page size for fetching results

You can pass these options to the COPY command by adding "WITH
PAGETIMEOUT=1000;", for example.
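
As a minimal sketch of how the two options combine, the Python helper below assembles a COPY TO statement that can be pasted into cqlsh. The helper name `build_copy_to`, the table `my_keyspace.big_table` and the output path are hypothetical; only the PAGESIZE and PAGETIMEOUT options come from the text above, and the values are illustrative rather than recommended.

```python
def build_copy_to(table, csv_path, page_size=None, page_timeout=None):
    """Build a cqlsh COPY TO statement with optional paging options."""
    options = []
    if page_size is not None:
        # Smaller pages mean less work per request for the nodes.
        options.append(f"PAGESIZE={int(page_size)}")
    if page_timeout is not None:
        # Timeout in seconds for fetching each page of results.
        options.append(f"PAGETIMEOUT={int(page_timeout)}")
    statement = f"COPY {table} TO '{csv_path}'"
    if options:
        # Multiple COPY options are joined with AND.
        statement += " WITH " + " AND ".join(options)
    return statement + ";"


if __name__ == "__main__":
    # Hypothetical table and file names, for illustration only.
    print(build_copy_to("my_keyspace.big_table", "/tmp/big_table.csv",
                        page_size=500, page_timeout=1000))
```

Lowering PAGESIZE and raising PAGETIMEOUT together trades export speed for fewer timeouts, so it is probably worth changing one value at a time and watching the error output.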

It will be slower than Spark, but to improve performance you can install the
Python driver with Cython extensions, as explained in the Setup section of this
blog.
The blog also explains how to compile the copy module itself with Cython.
This is not as important as compiling the driver, and on some versions you
may hit CASSANDRA-11574.




Re: COPY TO export fails with

2016-05-10 Thread Matthias Niehoff
Hi,

I had already suspected that COPY TO might not be the best way to do this.
I'll write a small Spark job.

Thanks


Re: COPY TO export fails with

2016-05-10 Thread Carlos Rolo
Hello,

That is a lot of data for a "COPY TO".

If you want a fast way to export, and you're fine with Java, you can use
Cassandra SSTableReader classes to read the sstables directly. Spark also
works.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | LinkedIn:
linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com


Re: COPY TO export fails with

2016-05-10 Thread Matthias Niehoff
Sorry, I sent the previous message too early.

More errors:

/export.cql:9:Error for (4549395184516451179, 4560441269902768904):
NoHostAvailable - ('Unable to complete the operation against any
hosts', {: ConnectionException('Host has
been marked down or removed',)}) (will try again later attempt 1 of 5)
/export.cql:9:Error for (-2083690356124961461, -2068514534992400755):
NoHostAvailable - ('Unable to complete the operation against any
hosts', {}) (will try again later attempt 1 of 5)
/export.cql:9:Error for (-4899866517058128956, -4897773268483324406):
NoHostAvailable - ('Unable to complete the operation against any
hosts', {}) (will try again later attempt 1 of 5)
/export.cql:9:Error for (-1435092096023471089, -1434747957681478442):
NoHostAvailable - ('Unable to complete the operation against any
hosts', {}) (will try again later attempt 1 of 5)
/export.cql:9:Error for (-2804962318029794069, -2783747272192843127):
NoHostAvailable - ('Unable to complete the operation against any
hosts', {}) (will try again later attempt 1 of 5)
/export.cql:9:Error for (-5188633782964403059, -5149722481923709224):
NoHostAvailable - ('Unable to complete the operation against any
hosts', {}) (will try again later attempt 1 of 5)



It looks like the cluster and its nodes cannot handle the export.

Is cqlsh COPY able to export this amount of data, or should other
methods be used (sstableloader, some custom code, Spark, …)?


Best regards



COPY TO export fails with

2016-05-10 Thread Matthias Niehoff
Hi,

I am trying to export the data of a table (~15 GB) using cqlsh COPY TO. It
fails with "no host available". If I try it with a smaller table, everything
works fine.

The statistics of the big table:

SSTable count: 81
Space used (live): 14102945336
Space used (total): 14102945336
Space used by snapshots (total): 62482577
Off heap memory used (total): 16399540
SSTable Compression Ratio: 0.1863544514417909
Number of keys (estimate): 5034845
Memtable cell count: 5590
Memtable data size: 18579542
Memtable off heap memory used: 0
Memtable switch count: 72
Local read count: 0
Local read latency: NaN ms
Local write count: 139878
Local write latency: 0.023 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 6224240
Bloom filter off heap memory used: 6223592
Index summary off heap memory used: 1098860
Compression metadata off heap memory used: 9077088
Compacted partition minimum bytes: 373
Compacted partition maximum bytes: 1358102
Compacted partition mean bytes: 16252
Average live cells per slice (last five minutes): 0.0
Maximum live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0
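
A rough back-of-the-envelope check, sketched in Python, of how much data the export actually has to read and serialize. It assumes that "SSTable Compression Ratio" means compressed size divided by uncompressed size, and that the figures describe a single node; the function names are hypothetical.

```python
# Figures copied from the table statistics above.
SPACE_USED_LIVE = 14102945336          # bytes on disk (compressed)
COMPRESSION_RATIO = 0.1863544514417909  # assumed: compressed / uncompressed
KEY_ESTIMATE = 5034845                  # Number of keys (estimate)
MEAN_PARTITION_BYTES = 16252            # Compacted partition mean bytes


def uncompressed_from_ratio(on_disk_bytes, ratio):
    """Estimate uncompressed size from on-disk size and compression ratio."""
    return on_disk_bytes / ratio


def uncompressed_from_partitions(keys, mean_partition_bytes):
    """Estimate uncompressed size as key count times mean partition size."""
    return keys * mean_partition_bytes


if __name__ == "__main__":
    by_ratio = uncompressed_from_ratio(SPACE_USED_LIVE, COMPRESSION_RATIO)
    by_keys = uncompressed_from_partitions(KEY_ESTIMATE, MEAN_PARTITION_BYTES)
    print(f"from compression ratio: {by_ratio / 1e9:.1f} GB")
    print(f"from partition sizes:   {by_keys / 1e9:.1f} GB")
```

Under those assumptions both estimates land well above the ~15 GB seen on disk, which may help explain why the export puts more load on the nodes than the on-disk size suggests.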


Some of the errors:

/export.cql:9:Error for (269754647900342974, 272655475232221549):
OperationTimedOut - errors={}, last_host=10.1.12.89 (will try again
later attempt 1 of 5)
/export.cql:9:Error for (-3191598516608295217, -3188807168672208162):
OperationTimedOut - errors={}, last_host=10.1.12.89 (will try again
later attempt 1 of 5)
/export.cql:9:Error for (-3066009427947359685, -3058745599093267591):
OperationTimedOut - errors={}, last_host=10.1.8.5 (will try again
later attempt 1 of 5)
/export.cql:9:Error for (-1737068099173540127, -1716693115263588178):
OperationTimedOut - errors={}, last_host=10.1.8.5 (will try again
later attempt 1 of 5)
/export.cql:9:Error for (-655042025062419794, -627527938552757160):
OperationTimedOut - errors={}, last_host=10.1.12.89 (will try again
later attempt 1 of 5)
/export.cql:9:Error for (2441403877625910843, 2445504271098651532):
OperationTimedOut - errors={}, last_host=10.1.12.89 (permanently given
up after 1000 rows and 1 attempts)
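
Each error above is reported for a pair of tokens, which suggests the export is processed one token range at a time. For anyone considering custom export code, the Python sketch below splits the ring into contiguous ranges and builds one query per range. It assumes the Murmur3 partitioner (64-bit signed tokens, consistent with the values in the errors); the function names, the table name and the partition key column `id` are hypothetical.

```python
MIN_TOKEN = -2**63       # lower bound of the Murmur3 token space
MAX_TOKEN = 2**63 - 1    # upper bound of the Murmur3 token space


def split_ring(num_ranges):
    """Split the token ring into num_ranges contiguous (start, end] ranges."""
    if num_ranges < 1:
        raise ValueError("num_ranges must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN
    # Integer arithmetic keeps the bounds exact for 64-bit tokens.
    bounds = [MIN_TOKEN + (total * i) // num_ranges
              for i in range(num_ranges + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_query(table, partition_key, start, end):
    """Build a SELECT restricted to one token range (start exclusive)."""
    return (f"SELECT * FROM {table} "
            f"WHERE token({partition_key}) > {start} "
            f"AND token({partition_key}) <= {end};")


if __name__ == "__main__":
    # Hypothetical table and key column, for illustration only.
    for start, end in split_ring(4):
        print(range_query("my_keyspace.big_table", "id", start, end))
```

Smaller ranges make each request cheaper and let a failed range be retried on its own, at the cost of more round trips.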


…



-- 
Matthias Niehoff | IT-Consultant | Agile Software Factory  | Consulting
codecentric AG | Zeppelinstr 2 | 76185 Karlsruhe | Germany
tel: +49 (0) 721.9595-681 | fax: +49 (0) 721.9595-666 | mobile: +49 (0)
172.1702676
www.codecentric.de | blog.codecentric.de | www.meettheexperts.de |
www.more4fi.de

Registered office: Solingen | HRB 25917 | Wuppertal District Court
Executive Board: Michael Hochgürtel . Mirko Novakovic . Rainer Vehns
Supervisory Board: Patric Fedlmeier (Chairman) . Klaus Jäger . Jürgen Schütz