Re: [discuss] Modernization of Cassandra build system

2015-04-11 Thread Łukasz Dywicki
Sorry for not coming back to this topic for so long.

You are right that what the Cassandra project has today does work. However, 
keeping package-scoping discipline in a development community as large as 
Cassandra's is clearly impossible without tool support (if you insist on 
keeping Ant, please try splitting the current build into separate javac tasks 
for the logical parts to verify that). I clearly pointed out that the build 
does not work reliably, which causes trouble with the artifacts uploaded to 
Maven Central. As I briefly counted in my earlier mail, there were 116 issues 
related to artifacts published by the build process. That is a lot, and these 
changes require additional maintenance releases to fix, for example, one or 
another bytecode-level dependency that causes NoClassDefFoundError with 
invalid artifacts. According to some recordings from DataStax, there is a plan 
to support multiple kinds of store in Cassandra (document, graph), so it will 
not get easier with time but rather harder. Ask yourself: do you really want 
to mix all these things together?

Starting from 2.x, Cassandra supports triggers, but writing even the simplest 
trigger, one that writes a log message or publishes a UDP packet, requires 
the whole of Cassandra and all its dependencies to be present during 
development.
The fact that everything sits in one big Ant build.xml is due to the trouble 
Ant itself creates when supporting multiple build modules, placeholders and so 
on, not because it is convenient to do so.

Modernizing the build and the internal dependencies does not bring a huge 
benefit at first, because your frontend is now CQL. However, it gives a real 
boost when it comes to community contributions, tool development, or even 
debugging. Sadly, keeping the current Ant build is a silent agreement to keep 
the internal mess and the rickety architecture of the project. Ant was already 
a legacy tool when Cassandra was launched. The longer you stay with it, the 
more trouble it will cause you over time.

Kind regards,
Lukasz


 Message written by Robert Stupp sn...@snazy.de on 2 Apr 2015, at 14:51:
 
 TL;DR - Benedict is right.
 
 IMO Maven is a nice, straight-forward tool if you know what you’re doing and 
 start on a _new_ project.
 But Maven easily becomes a pita if you want to do something that’s not 
 supported out-of-the-box.
 I bet that Maven would just not work for C* source tree with all the little 
 nice features that C*’s build.xml offers (just look at the scripted stuff in 
 build.xml).
 
 Possibly Gradle would be an option; I proposed switching to Gradle several 
 months ago. Same story (although Gradle is better than Maven ;) ).
 But… you need to know that build.xml is not just used to build the code and 
 artifacts. It is also used in CI, ccm, cstar-perf and some other custom 
 systems that exist and just work. So if we replaced ant with something 
 else, it would take a lot of effort to change several tools and systems. 
 And there must be a guarantee that everything works like it did before.
 
 Regarding IDEs: I’m using IDEA every day and it works like a charm with C*. 
 Eclipse is "supported natively" by "ant generate-eclipse-files". TBH I don’t 
 know NetBeans.
 
 As Benedict pointed out, the code has improved and still improves a lot - in 
 structure, in inline-doc, in nomenclature and whatever else. As soon as we 
 can get rid of Thrift in the tree, there’s another big opportunity to clean 
 up more stuff.
 
 TBH I don’t think that (besides the tools) there would be a need to generate 
 multiple artifacts for the C* daemon - you can do "separation of concerns" 
 (via packages) even with discipline and then measure it.
 IMO the only artifact worth extracting from the C* tree, and useful for a 
 (limited) set of 3rd party code, is something like 
 "cassandra-jmx-interfaces.jar"
 
 Robert
 
 On 02.04.2015 at 11:30, Benedict Elliott Smith 
 belliottsm...@datastax.com wrote:
 
 There are three distinct problems you raise: code structure, documentation,
 and build system.
 
 The build system, as far as I can tell, is a matter of personal preference.
 I personally dislike the few interactions I've had with maven, but
 thankfully my interactions with build system innards have been fairly
 limited. I mostly just use them. Unless a concrete and significant benefit
 is delivered by maven, though, it just doesn't seem worth the upheaval to
 me. If you can make the argument that it actually improves the project in a
 way that justifies the upheaval, it will certainly be considered, but so
 far no justification has been made.
 
 The documentation problem is common to many projects, though: out of
 codebase documentation gets stale very rapidly. When we say to read the
 code we mean read the code and its inline documentation - the quality of
 this documentation has itself generally been substandard, but has been
 improving significantly over the past year or so, and we are endeavouring
 to improve with every change. In the meantime, there are videos from a
 

[BBG-137068] New Ticket: [jira] [Updated] (CASSANDRA-9120) OutOfMemoryError when read auto-saved cache (probably broken)

2015-04-11 Thread support
 (dev@cassandra.apache.org) said:


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-9120:
--
Assignee: (was: Jeff Jirsa)

 OutOfMemoryError when read auto-saved cache (probably broken)
 -

 Key: CASSANDRA-9120
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9120
 Project: Cassandra
  Issue Type: Bug
 Environment: Linux
Reporter: Vladimir
 Fix For: 3.0, 2.0.15, 2.1.5


 Found during tests on a 100-node cluster. After a restart I found that one 
 node constantly crashes with an OutOfMemoryError. I guess that the auto-saved 
 cache was corrupted and Cassandra can't recognize it. I see that similar 
 issues were already fixed (when a negative size of some structure was read). 
 Does the auto-saved cache have a checksum? It would help to reject a 
 corrupted cache at the very beginning.
 As far as I can see, the current code still has that problem. The stack 
 trace is:
 {code}
 INFO [main] 2015-03-28 01:04:13,503 AutoSavingCache.java (line 114) reading 
 saved cache 
 /storage/core/loginsight/cidata/cassandra/saved_caches/system-sstable_activity-KeyCache-b.db
 ERROR [main] 2015-03-28 01:04:14,718 CassandraDaemon.java (line 513) 
 Exception encountered during startup
 java.lang.OutOfMemoryError: Java heap space
 at java.util.ArrayList.<init>(Unknown Source)
 at 
 org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:120)
 at 
 org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:365)
 at 
 org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:119)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:262)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:421)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:392)
 at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:315)
 at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:272)
 at org.apache.cassandra.db.Keyspace.open(Keyspace.java:114)
 at org.apache.cassandra.db.Keyspace.open(Keyspace.java:92)
 at 
 org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:536)
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:261)
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
 {code}
 I looked at the Cassandra source code and saw:
 http://grepcode.com/file/repo1.maven.org/maven2/org.apache.cassandra/cassandra-all/2.0.10/org/apache/cassandra/db/RowIndexEntry.java
 119 int entries = in.readInt();
 120 List columnsIndex = new ArrayList(entries);
 It seems that the value of entries is invalid (negative), so it tries to 
 allocate an array with a huge initial capacity and hits OOM. I deleted the 
 saved_cache directory and was then able to start the node correctly. We 
 should expect that this may happen in the real world. Cassandra should be 
 able to skip incorrect cached data and run.
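
 The kind of guard asked for above could look roughly like the sketch below. 
 It is not Cassandra's code or its eventual fix: the class SafeCountReader, 
 the constant MAX_ENTRIES, the bytesRemaining parameter and the toy format (an 
 int count followed by that many 8-byte longs) are all assumptions made for 
 illustration. Note that new ArrayList(n) throws IllegalArgumentException for 
 a negative n, so an OutOfMemoryError points at a very large positive count; 
 the check rejects both cases before anything is allocated.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class SafeCountReader {
    // Hypothetical upper bound; a real limit would come from the file format.
    static final int MAX_ENTRIES = 1 << 20;

    /**
     * Reads a length-prefixed list of longs (toy format, not Cassandra's),
     * rejecting implausible counts before allocating the list.
     */
    static List<Long> readEntries(DataInputStream in, int bytesRemaining) throws IOException {
        int entries = in.readInt();
        // Each entry takes 8 bytes here, so the count cannot exceed the payload left after the int.
        long maxByPayload = (bytesRemaining - 4) / 8;
        if (entries < 0 || entries > MAX_ENTRIES || entries > maxByPayload)
            throw new IOException("corrupt cache: implausible entry count " + entries);

        List<Long> result = new ArrayList<>(entries);
        for (int i = 0; i < entries; i++)
            result.add(in.readLong());
        return result;
    }

    public static void main(String[] args) {
        byte[] good = ByteBuffer.allocate(20).putInt(2).putLong(7L).putLong(9L).array();
        byte[] corrupt = ByteBuffer.allocate(4).putInt(Integer.MAX_VALUE).array();

        for (byte[] data : new byte[][] { good, corrupt }) {
            try {
                DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
                System.out.println("loaded " + readEntries(in, data.length));
            } catch (IOException e) {
                // A saved cache is only an optimisation, so skip it and keep starting up.
                System.out.println("skipping saved cache: " + e.getMessage());
            }
        }
    }
}
```

 Treating a bad count as an IOException lets the caller discard the saved 
 cache and continue startup, which is the behaviour requested above. A 
 checksum over the whole file, as also suggested, would catch corruption that 
 leaves the count plausible.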



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)