[jira] [Comment Edited] (CASSANDRA-14466) Enable Direct I/O

2018-06-07 Thread Brian O'Neill (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504784#comment-16504784
 ] 

Brian O'Neill edited comment on CASSANDRA-14466 at 6/7/18 3:57 PM:
---

Although I don't know how Cassandra behaves in this case, my own experiments 
with O_DIRECT have shown that it improves performance in very few 
applications: specifically, those which do very little computation (low CPU 
load) and access only the fastest available SSDs. Cassandra tends to be 
fairly heavy on the CPU, and so I'm skeptical that switching to O_DIRECT by 
itself is the reason performance improved. In addition, I've only seen 
significant improvement with O_DIRECT when bypassing the file system and 
accessing the block device directly.

As was already suggested, switching off read-ahead might be the reason why 
you're seeing improved performance. Although the file system is generally able 
to adapt to usage patterns, I've found that explicitly setting 
POSIX_FADV_RANDOM really helps for random access workloads. This behavior is 
implicit with O_DIRECT. Although I see that Cassandra has utility code to call 
fadvise, only the POSIX_FADV_DONTNEED option ever appears to be used.

Considering that the test machine had 128GB of RAM, you really want the OS to 
manage the cache instead of Cassandra. How large was the JVM heap when running 
the test? For larger data sizes, caching more data in the Java heap will lead 
to GC problems.
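
For reference, a minimal sketch of what a direct read looks like with the JDK 10 API mentioned in the issue description (ExtendedOpenOption.DIRECT). This is not the code from direct_io.patch: the class and method names are hypothetical, and the fallback to a buffered read is an assumption added for file systems that reject O_DIRECT.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative only: names are hypothetical, not taken from the patch. */
final class DirectReadSketch {

    /** Reads one block, trying direct I/O first and falling back to a buffered read. */
    static byte[] readBlock(Path file, long blockIndex) throws IOException {
        try {
            return readBlockWith(file, blockIndex,
                    StandardOpenOption.READ, ExtendedOpenOption.DIRECT);
        } catch (IOException | UnsupportedOperationException e) {
            // Assumption: some file systems (e.g. tmpfs on older kernels) reject O_DIRECT.
            return readBlockWith(file, blockIndex, StandardOpenOption.READ);
        }
    }

    private static byte[] readBlockWith(Path file, long blockIndex, OpenOption... options)
            throws IOException {
        // Direct I/O requires the buffer address, the file position and the
        // transfer size to all be multiples of the file store's block size.
        int blockSize = (int) Files.getFileStore(file).getBlockSize();
        ByteBuffer buf = ByteBuffer.allocateDirect(2 * blockSize).alignedSlice(blockSize);
        buf.limit(blockSize);
        try (FileChannel ch = FileChannel.open(file, options)) {
            ch.read(buf, blockIndex * blockSize); // may hit EOF or return a short read
        }
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    /** Writes data to a temp file and reads the first block back. */
    static byte[] roundTrip(byte[] data) {
        try {
            Path tmp = Files.createTempFile("direct-read", ".bin");
            try {
                Files.write(tmp, data);
                return readBlock(tmp, 0);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello".getBytes()).length + " bytes read");
    }
}
```

Note that the sketch only bypasses the page cache; it says nothing about whether doing so is faster than a buffered read with read-ahead disabled, which is the open question above.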


> Enable Direct I/O 
> --
>
> Key: CASSANDRA-14466
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14466
> Project: Cassandra
> Issue Type: New Feature
> Components: Local Write-Read Paths
> Reporter: Mulugeta Mammo
> Priority: Major
> Attachments: direct_io.patch
>
>
> Hi,
> JDK 10 introduced a new API for Direct IO that enables applications to bypass 
> the file system cache and potentially improve performance. Details of this 
> feature can be found at [https://bugs.openjdk.java.net/browse/JDK-8164900].
> This patch uses the JDK 10 API to enable Direct IO for the Cassandra read 
> path. By default, we have disabled this feature; but it can be enabled using 
> a new configuration parameter, enable_direct_io_for_read_path. We have 
> conducted a Cassandra read-only stress test and measured a throughput gain of 
> up to 60% on flash drives.
> The patch requires JDK 10 Cassandra Support - 
> https://issues.apache.org/jira/browse/CASSANDRA-9608 
> Please review the patch and let us know your feedback.
> Thanks,
> [^direct_io.patch]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14466) Enable Direct I/O

2018-06-07 Thread Brian O'Neill (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504784#comment-16504784
 ] 

Brian O'Neill commented on CASSANDRA-14466:
---

Although I don't know how Cassandra behaves in this case, my own experiments 
with O_DIRECT have shown that it improves performance in very few 
applications: specifically, those which do very little computation (low CPU 
load) and access only the fastest available SSDs. Cassandra tends to be 
fairly heavy on the CPU, and so I'm skeptical that switching to O_DIRECT by 
itself is the reason performance improved. In addition, I've only seen 
significant improvement with O_DIRECT when bypassing the file system and 
accessing the block device directly.

As was already suggested, switching off read-ahead might be the reason why 
you're seeing improved performance. Although the file system is generally able 
to adapt to usage patterns, I've found that explicitly setting 
POSIX_FADV_RANDOM really helps for random access workloads. This behavior is 
implicit with O_DIRECT. Although I see that Cassandra has utility code to call 
fadvise, only the POSIX_FADV_DONTNEED option ever appears to be used.

Considering that the test machine had 128GB of RAM, you really want the OS to 
manage the cache instead of Cassandra. How large was the JVM heap when running 
the test? For larger data sizes, caching more data in the Java heap will lead 
to GC problems.







[jira] [Comment Edited] (CASSANDRA-10699) Make schema alterations strongly consistent

2018-04-24 Thread Brian O'Neill (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451647#comment-16451647
 ] 

Brian O'Neill edited comment on CASSANDRA-10699 at 4/25/18 4:23 AM:


With this change, will schema alterations be versioned? I think this would be 
helpful when plugging in other storage engines. 
[CASSANDRA-13474|https://issues.apache.org/jira/browse/CASSANDRA-13474]


> Make schema alterations strongly consistent
> ---
>
> Key: CASSANDRA-10699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10699
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Aleksey Yeschenko
> Assignee: Aleksey Yeschenko
> Priority: Major
> Fix For: 4.0
>
>
> Schema changes do not necessarily commute. This has been the case before 
> CASSANDRA-5202, but now is particularly problematic.
> We should employ a strongly consistent protocol instead of relying on 
> marshalling {{Mutation}} objects with schema changes.






[jira] [Comment Edited] (CASSANDRA-10699) Make schema alterations strongly consistent

2018-04-24 Thread Brian O'Neill (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451647#comment-16451647
 ] 

Brian O'Neill edited comment on CASSANDRA-10699 at 4/25/18 4:22 AM:


With this change, will schema alterations be versioned? I think this would be 
helpful when plugging in other storage engines. 
https://issues.apache.org/jira/browse/CASSANDRA-13474








[jira] [Commented] (CASSANDRA-10699) Make schema alterations strongly consistent

2018-04-24 Thread Brian O'Neill (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451647#comment-16451647
 ] 

Brian O'Neill commented on CASSANDRA-10699:
---

With this change, will schema alterations be versioned? I think this would be 
helpful when plugging in other storage engines. [#CASSANDRA-13474]







[jira] [Commented] (CASSANDRA-13475) Pluggable storage engine design

2018-04-12 Thread Brian O'Neill (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436701#comment-16436701
 ] 

Brian O'Neill commented on CASSANDRA-13475:
---

A "storage engine" here is analogous to the storage engine concept in MySQL. 
InnoDB and MyISAM are MySQL storage engines, and a few others exist as well. A 
driver swap, by contrast, is analogous to swapping MySQL for PostgreSQL, which 
aren't fully compatible with each other. Pluggable storage inside Cassandra 
allows applications to continue using Cassandra with a high degree of 
compatibility.

An application can define each table against the engine which makes the most 
sense for it, and all of the engines can co-exist. Over time, the preferred 
storage engine might change, much like how InnoDB is now preferred over 
MyISAM, which was the original.
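
To make the per-table idea concrete, here is a toy sketch. Everything in it (StorageEngine, MapEngine, Catalog, and the table names) is hypothetical and is not the API from the design doc; it only illustrates tables bound to different engines co-existing behind one interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model; every name here is hypothetical. */
final class PluggableStorageSketch {

    /** The contract each engine implements; callers never see the engine type. */
    interface StorageEngine {
        String name();
        void put(String key, String value);
        String get(String key); // returns null when the key is absent
    }

    /** A trivial engine whose behavior depends on the map it is given. */
    static final class MapEngine implements StorageEngine {
        private final String name;
        private final Map<String, String> rows;

        MapEngine(String name, Map<String, String> rows) {
            this.name = name;
            this.rows = rows;
        }

        public String name() { return name; }
        public void put(String key, String value) { rows.put(key, value); }
        public String get(String key) { return rows.get(key); }
    }

    /** Maps each table to the engine chosen when the table was defined. */
    static final class Catalog {
        private final Map<String, StorageEngine> engineByTable = new HashMap<>();

        void createTable(String table, StorageEngine engine) {
            engineByTable.put(table, engine);
        }

        StorageEngine engineFor(String table) {
            StorageEngine engine = engineByTable.get(table);
            if (engine == null) {
                throw new IllegalArgumentException("unknown table: " + table);
            }
            return engine;
        }
    }

    /** Two tables on two different engines, used through the same interface. */
    static String demo() {
        Catalog catalog = new Catalog();
        catalog.createTable("events", new MapEngine("hash", new HashMap<>()));
        catalog.createTable("timeline", new MapEngine("sorted", new TreeMap<>()));
        catalog.engineFor("events").put("k", "v1");
        catalog.engineFor("timeline").put("k", "v2");
        return catalog.engineFor("events").get("k") + ","
                + catalog.engineFor("timeline").get("k");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```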

> Pluggable storage engine design
> ---
>
> Key: CASSANDRA-13475
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13475
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Dikang Gu
> Assignee: Dikang Gu
> Priority: Major
>
> In this jira, we discuss how to make Cassandra's storage engine pluggable. 
> We will discuss the scope, expectations, and guidelines for this project, as 
> well as a detailed design so that we can create sub tasks for each small 
> project.
> Here is a design doc we are currently working on:  
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc






[jira] [Issue Comment Deleted] (CASSANDRA-9608) Support Java 9

2018-03-31 Thread Brian O'Neill (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian O'Neill updated CASSANDRA-9608:
-
Comment: was deleted

(was: [~snazy], why not use a plain ReentrantLock inside AtomicBTreePartition? 
It's equivalent to using a monitor, and it's much simpler than using a custom 
spin lock. Also, is it possible to break the patch up and make incremental 
non-breaking changes? This will make it easier later when it's decided how best 
to support Java 9.)

> Support Java 9
> --
>
> Key: CASSANDRA-9608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9608
> Project: Cassandra
> Issue Type: Task
> Reporter: Robert Stupp
> Assignee: Robert Stupp
> Priority: Minor
>
> This ticket is intended to group all issues found to support Java 9 in the 
> future.
> From what I've found out so far:
> * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved. 
> It can be easily solved using this patch:
> {code}
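> -        <dependency groupId="net.sourceforge.cobertura" artifactId="cobertura"/>
> +        <dependency groupId="net.sourceforge.cobertura" artifactId="cobertura">
> +          <exclusion groupId="com.sun" artifactId="tools"/>
> +        </dependency>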
> {code}
> * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods 
> {{monitorEnter}} + {{monitorExit}}. These methods are used by 
> {{o.a.c.utils.concurrent.Locks}} which is only used by 
> {{o.a.c.db.AtomicBTreeColumns}}.
> I don't intend to start working on this yet, since Java 9 is still too early 
> in its development phase.






[jira] [Commented] (CASSANDRA-9608) Support Java 9

2018-03-31 Thread Brian O'Neill (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421361#comment-16421361
 ] 

Brian O'Neill commented on CASSANDRA-9608:
--

[~snazy], why not use a plain ReentrantLock inside AtomicBTreePartition? It's 
equivalent to using a monitor, and it's much simpler than using a custom spin 
lock. Also, is it possible to break the patch up and make incremental 
non-breaking changes? This will make it easier later when it's decided how best 
to support Java 9.
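
A minimal sketch of the ReentrantLock suggestion. The class below is a hypothetical stand-in, not the real AtomicBTreePartition; the point is only that lock()/unlock() in a try/finally gives the same mutual exclusion as the monitorEnter/monitorExit pair without touching sun.misc.Unsafe.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongUnaryOperator;

/** Hypothetical stand-in for a partition whose contended updates are serialized. */
final class LockedPartitionSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private long state; // placeholder for the real partition contents

    /** Applies fn to the state while holding the lock and returns the new state. */
    long update(LongUnaryOperator fn) {
        lock.lock();          // where Unsafe.monitorEnter(...) was used
        try {
            state = fn.applyAsLong(state);
            return state;
        } finally {
            lock.unlock();    // where Unsafe.monitorExit(...) was used
        }
    }

    /** Starts the given number of threads, each incrementing perThread times. */
    static long runContended(int threads, int perThread) {
        LockedPartitionSketch p = new LockedPartitionSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    p.update(v -> v + 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return p.update(v -> v); // read the final state under the lock
    }

    public static void main(String[] args) {
        System.out.println(runContended(4, 10_000));
    }
}
```

Because ReentrantLock is reentrant, nested acquisition by the same thread behaves like a monitor would, so callers that previously relied on re-entering the monitor should not need restructuring.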



