[Cassandra Wiki] Update of "TopLevelPackages" by daniels

Apache Wiki Sun, 18 May 2014 15:58:25 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.


The "TopLevelPackages" page has been changed by daniels:
https://wiki.apache.org/cassandra/TopLevelPackages?action=diff&rev1=1&rev2=2

  
  === io ===
  
- This large package talks to the file system on behalf of C*. The bulk of this 
work consists of creating and using SSTables, which is the format that C* uses 
to store data. Other responsibilities include on-disk compression as well as 
some general-purpose I/O functionality for other code to use, including 
facilities for custom memory management. The main class here is SSTable, which 
represents an abstract persistent container of sorted data. SSTableWriter and 
SSTableReader are derived from SSTable and expose additional read and write 
functionality, and are used quite extensively by other packages (primarily db).
+ This large package talks to the file system on behalf of C*. The bulk of this 
work consists of creating and using `SSTables`, which is the format that C* 
uses to store data. Other responsibilities include on-disk compression as well 
as some general-purpose I/O functionality for other code to use, including 
facilities for custom memory management. The main class here is `SSTable`, 
which represents an abstract persistent container of sorted data. 
`SSTableWriter` and `SSTableReader` are derived from `SSTable` and expose 
additional read and write functionality, and are used quite extensively by 
other packages (primarily `db`).
  
  === db ===
  
- This huge package takes up almost a quarter of the entire codebase and 
implements the database engine. It operates in familiar database terms 
including Cells, Rows, ColumnFamilies (tables) and Keyspaces (databases). db 
heavily relies on io.SSTables for data persistence, but also reaches into many 
other packages for various tasks. Internally, db can be broken down into its 
own sublayers and also contains multiple subpackages for things such as data 
marshalling, commit log management, storage compaction and others. Overall, db 
is large and complicated enough to deserve an architectural study of its own.
+ This huge package takes up almost a quarter of the entire codebase and 
implements the database engine. It operates in familiar database terms 
including `Cells`, `Rows`, `ColumnFamilies` (tables) and `Keyspaces` 
(databases). `db` heavily relies on `io.SSTables` for data persistence, but 
also reaches into many other packages for various tasks. Internally, `db` can 
be broken down into its own sublayers and also contains multiple subpackages 
for things such as data marshalling, commit log management, storage compaction 
and others. Overall, `db` is large and complicated enough to deserve an 
architectural study of its own.
  
  === serializers ===
  
- This small package is subordinate to db, and contains utility methods for 
converting primitive types to byte buffers.
+ This small package is subordinate to `db`, and contains utility methods for 
converting primitive types to byte buffers.
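
As a rough illustration of what "converting primitive types to byte buffers" can look like, here is a minimal sketch built only on `java.nio.ByteBuffer`. The class name `PrimitiveBytes` and its methods are hypothetical and are not taken from the `serializers` package.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Cassandra class: round-trips a few
// primitive values through ByteBuffer.
final class PrimitiveBytes {
    private PrimitiveBytes() {}

    // Encode an int as 4 big-endian bytes (ByteBuffer's default order).
    static ByteBuffer fromInt(int value) {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.putInt(0, value); // absolute put: position stays at 0
        return buffer;
    }

    // Decode an int without disturbing the buffer's position.
    static int toInt(ByteBuffer buffer) {
        return buffer.getInt(buffer.position());
    }

    // Encode text as UTF-8 bytes.
    static ByteBuffer fromText(String text) {
        return ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decode UTF-8 text; duplicate() leaves the caller's position untouched.
    static String toText(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }
}
```

Using absolute reads and `duplicate()` keeps the helpers free of side effects on the buffer, which matters when the same buffer is shared between callers.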
  
  === notifications ===
  
- This tiny package contains interfaces and classes that allow other code to 
hook certain internal db events with custom code. It can be used for things 
such as unit tests, but may also be hooked into with other external 
functionality.
+ This tiny package contains interfaces and classes that allow other code to 
hook certain internal `db` events with custom code. It can be used for things 
such as unit tests, but may also be hooked into with other external 
functionality.
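
The hooking idea described here is essentially the observer pattern. The sketch below shows that pattern in isolation; `NotificationSketch`, `Subscriber` and the string-typed events are assumptions made for illustration and do not mirror the real interfaces in `notifications`.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical observer-style hook, not the actual Cassandra API.
final class NotificationSketch {
    // Custom code implements this to react to an internal event.
    interface Subscriber {
        void onEvent(String event);
    }

    // CopyOnWriteArrayList lets subscribers change while events are published.
    private final List<Subscriber> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Subscriber subscriber) {
        subscribers.add(subscriber);
    }

    void unsubscribe(Subscriber subscriber) {
        subscribers.remove(subscriber);
    }

    // Deliver the event to every currently registered subscriber.
    void publish(String event) {
        for (Subscriber subscriber : subscribers) {
            subscriber.onEvent(event);
        }
    }
}
```

A unit test could subscribe a listener that records events, trigger the operation under test, and then assert on what was recorded.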
  
  === cache ===
  
- cache is a smaller package containing primitives used to implement key and 
row caching. The class that actually orchestrates all that caching activity 
(CacheService) lives in service, not cache. cache is primarily used for the 
benefit of db, but db almost never uses cache directly, instead proxying 
through service.CacheService. Without this extra indirection, cache could 
easily be structured as a subpackage under db.
+ `cache` is a smaller package containing primitives used to implement key and 
row caching. The class that actually orchestrates all that caching activity 
(`CacheService`) lives in `service`, not `cache`. `cache` is primarily used for 
the benefit of `db`, but `db` almost never uses `cache` directly, instead 
proxying through `service.CacheService`. Without this extra indirection, 
`cache` could easily be structured as a subpackage under `db`.
  
  === cql3 ===
  
- Read and write APIs provided by db are difficult to use directly, so C* 
provides a query language for easier access to the underlying data–Cassandra 
Query Language or CQL. The language is implemented in its own package cql3 (the 
third major release of the language). cql3 defines the language grammar and 
implements the QueryProcessor as well as all the Statements and related 
functionality available in CQL. It is interesting that cql3 is not a purely 
externally facing API; some internal code actually leverages it to store and 
retrieve system state information. In that respect, CQL is becoming a core 
component.
+ Read and write APIs provided by `db` are difficult to use directly, so C* 
provides a query language for easier access to the underlying data: the 
Cassandra Query Language (CQL). The language is implemented in its own package `cql3` 
(the third major release of the language). `cql3` defines the language grammar 
and implements the `QueryProcessor` as well as all the `Statements` and related 
functionality available in CQL. It is interesting that `cql3` is not a purely 
externally facing API; some internal code actually leverages it to store and 
retrieve system state information. In that respect, CQL is becoming a core 
component.
  
  === net ===
  
- This package implements MessagingService, which abstracts away most 
networking machinery from the rest of the codebase. Other packages can then set 
up communication protocols represented by different Verbs and send custom 
Messages carrying those verbs to remote nodes. Verbs are handled on the 
receiving side with VerbHandlers. net provides only base types and 
functionality common to all messages; specialized implementation live in 
various other packages.
+ This package implements `MessagingService`, which abstracts away most 
networking machinery from the rest of the codebase. Other packages can then set 
up communication protocols represented by different `Verbs` and send custom 
`Messages` carrying those verbs to remote nodes. `Verbs` are handled on the 
receiving side with `VerbHandlers`. `net` provides only base types and 
functionality common to all messages; specialized implementations live in 
various other packages.
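
The verb and handler relationship can be pictured as a dispatch table keyed by verb. The following is a simplified sketch under that assumption; the `Verb` values, the `Message` shape and the `register`/`receive` methods are invented for illustration and are not the real `MessagingService` API.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, in-process model of verb-based dispatch.
final class MessagingSketch {
    // Each verb names one kind of message (values are made up).
    enum Verb { PING, ECHO }

    // A handler processes the payload of messages carrying its verb.
    interface VerbHandler {
        String handle(String payload);
    }

    static final class Message {
        final Verb verb;
        final String payload;

        Message(Verb verb, String payload) {
            this.verb = verb;
            this.payload = payload;
        }
    }

    private final Map<Verb, VerbHandler> handlers = new EnumMap<>(Verb.class);

    // Other packages register a handler for the verbs they own.
    void register(Verb verb, VerbHandler handler) {
        handlers.put(verb, handler);
    }

    // On the receiving side, look up the handler by verb and invoke it.
    String receive(Message message) {
        VerbHandler handler = handlers.get(message.verb);
        if (handler == null) {
            throw new IllegalStateException("No handler for " + message.verb);
        }
        return handler.handle(message.payload);
    }
}
```

Keeping the table in one place is what lets the messaging layer stay generic while the verb-specific logic lives in the packages that register handlers.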
  
  === sink ===
  
- Used by net and service, this tiny package is used to hook into messaging 
events. This is primarily useful for unit tests.
+ Used by `net` and `service`, this tiny package is used to hook into messaging 
events. This is primarily useful for unit tests.
  
  === security ===
  
- This tiny package, currently containing only one class SSLFactory, is used to 
encrypt communication over the network.
+ This tiny package, currently containing only one class `SSLFactory`, is used 
to encrypt communication over the network.
  
  === gms ===
  
- gms (possibly standing for Gossip Message Service) implements the Gossiper. 
Gossiper is a peer-to-peer service that deals with disseminating cluster state 
information among member nodes. Gossiping consists of detecting unresponsive 
nodes using heart beat messages, and sharing liveness data among peers.
+ `gms` (possibly standing for Gossip Message Service) implements the 
`Gossiper`. `Gossiper` is a peer-to-peer service that deals with disseminating 
cluster state information among member nodes. Gossiping consists of detecting 
unresponsive nodes using heartbeat messages, and sharing liveness data among 
peers.
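
To make the heartbeat idea concrete, here is a deliberately naive sketch that marks a node as alive if a heartbeat was seen within a fixed timeout. This is an assumption-laden simplification: the class `HeartbeatTracker` is hypothetical, and the real failure detection in `gms` should be expected to be more sophisticated than a fixed timeout.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical fixed-timeout liveness tracker, not the Gossiper itself.
final class HeartbeatTracker {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeenMillis = new HashMap<>();

    HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Record that a heartbeat from the node arrived at the given time.
    void onHeartbeat(String node, long nowMillis) {
        lastSeenMillis.put(node, nowMillis);
    }

    // A node is alive if it was heard from within the timeout window.
    boolean isAlive(String node, long nowMillis) {
        Long lastSeen = lastSeenMillis.get(node);
        return lastSeen != null && nowMillis - lastSeen <= timeoutMillis;
    }
}
```

Passing the clock value in as a parameter, rather than reading the system clock inside the class, keeps the sketch deterministic and easy to test.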
  
  === locator ===
  
- locator is responsible for two separate tasks. One is discovering cluster 
topology through a pluggable component called Snitch. A few snitches are 
available out of the box, and some dynamic implementations heavily rely on 
Gossiper to detect up-to-date cluster topology. The second responsibility is 
deciding how to optimally distribute replicas based on discovered topology 
(handled by a class called ReplicationStrategy and its subclasses).
+ `locator` is responsible for two separate tasks. One is discovering cluster 
topology through a pluggable component called `Snitch`. A few snitches are 
available out of the box, and some dynamic implementations heavily rely on 
`Gossiper` to detect up-to-date cluster topology. The second responsibility is 
deciding how to optimally distribute replicas based on discovered topology 
(handled by a class called `ReplicationStrategy` and its subclasses).
  
  === streaming ===
  
- Another core networking package, streaming is responsible for moving bulk 
data between the nodes in the cluster.
+ Another core networking package, `streaming` is responsible for moving bulk 
data between the nodes in the cluster.
  
  === repair ===
  
- This smaller package deals with running RepairSessions, which redistribute 
data after a change in the cluster, or when corruption is detected in one of 
the existing nodes. Repair events are just one example where streaming is used.
+ This smaller package deals with running `RepairSessions`, which redistribute 
data after a change in the cluster, or when corruption is detected in one of 
the existing nodes. Repair events are just one example where `streaming` is 
used.
  
  === service ===
  
- service, although not the largest package, can be thought of as the skeleton 
upon which all other functionality builds. service consists of an executable 
class CassandraDaemon (this class contains the main() function of the C* 
daemon), along with a set of core services, including StorageProxy and 
StorageService. A lot of, if not most, of inter-package communication within 
the codebase is brokered through one of those two. StorageService is more 
involved in orchestrating Dynamo-level activities in the cluster, whereas 
StorageProxy is more focused on handling data transfer.
+ `service`, although not the largest package, can be thought of as the 
skeleton upon which all other functionality builds. `service` consists of an 
executable class `CassandraDaemon` (this class contains the `main()` function 
of the C* daemon), along with a set of core services, including `StorageProxy` 
and `StorageService`. Much, if not most, of the inter-package communication 
within the codebase is brokered through one of those two. `StorageService` is 
more involved in orchestrating Dynamo-level activities in the cluster, whereas 
`StorageProxy` is more focused on handling data transfer.
  
- service is critical to most other modules, many of which are free to call 
into it from arbitrary places. A traditional weakness of such omnipresent 
uberpackages is that they attract all sort of miscellaneous functionality that 
doesn’t seem to belong anywhere else, and service is no exception: expect to 
see a lot of random bits and pieces residing here.
+ `service` is critical to most other modules, many of which are free to call 
into it from arbitrary places. A traditional weakness of such omnipresent 
uberpackages is that they attract all sorts of miscellaneous functionality that 
doesn’t seem to belong anywhere else, and `service` is no exception: expect to 
see a lot of random bits and pieces residing here.
  
  === config ===
  
- Another omnipackage seemingly accessible from anywhere, config is a 
repository for configurable settings as well as a static entry point into the 
data store (through a class Schema which contains a reference to all keyspaces 
residing in the local cluster).
+ Another omnipackage seemingly accessible from anywhere, `config` is a 
repository for configurable settings as well as a static entry point into the 
data store (through a class `Schema` which contains a reference to all 
keyspaces residing in the local cluster).
  
  === transport ===
  
- This is one of external API providers. transport implements a Server that 
listens for connecting clients that want to use C* Native protocol, which as of 
Cassandra 2.0 is the primary communication protocol both for external clients 
and within the cluster.
+ This is one of the external API providers. `transport` implements a Server that 
listens for connecting clients that want to use the C* Native protocol, which as of 
Cassandra 2.0 is the primary communication protocol both for external clients 
and within the cluster.
  
  === thrift ===
  
@@ -98, +98 @@

  
  === tools ===
  
- This package contains the implementation of several administrative utilities 
shipped with C*, including the Node tool, tools for import and export, as well 
as several utilities for SSTable maintenance. All these tools are available 
under /bin in a typical distribution.
+ This package contains the implementation of several administrative utilities 
shipped with C*, including the `Node` tool, tools for import and export, as 
well as several utilities for `SSTable` maintenance. All these tools are 
available under `/bin` in a typical distribution.
  
  === dht ===
  
- dht (Distributed Hash Table) is a core support class that is responsible for 
partitioning data among the nodes in the cluster. It contains several pluggable 
implementations of AbstractPartitioner class that handles the mechanics of data 
partitioning. In addition it defines Range and Token, which are primitives used 
by other packages to work with partition key ranges.
+ `dht` (Distributed Hash Table) is a core support package that is responsible 
for partitioning data among the nodes in the cluster. It contains several 
pluggable implementations of the `AbstractPartitioner` class, which handle the 
mechanics of data partitioning. In addition, it defines `Range` and `Token`, 
which are primitives used by other packages to work with partition key ranges.
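
One common way to picture token-based partitioning is a sorted ring in which each node owns the range ending at its token. The sketch below assumes that model, with plain `long` tokens; `TokenRing` and its methods are hypothetical and do not correspond to the actual `Range`, `Token` or partitioner classes.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical token ring: each node owns (previousToken, ownToken].
final class TokenRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Place a node on the ring at the given token.
    void addNode(long token, String node) {
        ring.put(token, node);
    }

    // The owner is the first node at or after the token, wrapping
    // around to the smallest token when none is found.
    String ownerOf(long token) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("Ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

A partitioner in this picture would be whatever function maps a partition key to the `long` token passed to `ownerOf`.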
  
  === utils ===
  
- This hefty package is a grab bag of miscellaneous classes typical to any 
software project, the proverbial "other" section. It is not the best place to 
look for architectural pillars, but it contains some clever code that is 
partially responsible for Cassandra’s impressive perf and reliability, 
including implementations of BloomFilter and MerkleTree among others.
+ This hefty package is a grab bag of miscellaneous classes typical to any 
software project, the proverbial "other" section. It is not the best place to 
look for architectural pillars, but it contains some clever code that is 
partially responsible for Cassandra’s impressive performance and reliability, 
including implementations of `BloomFilter` and `MerkleTree` among others.
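
For readers unfamiliar with Bloom filters, the following is a minimal sketch of the general technique: a bit set plus several hash positions per key, which can report false positives but never false negatives. `TinyBloomFilter` is a hypothetical teaching example and is not the `BloomFilter` implementation found in `utils`; in particular, the hashing scheme here is an assumption chosen for brevity.

```java
import java.util.BitSet;

// Hypothetical, minimal Bloom filter for illustration only.
final class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    TinyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so the step is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    // Set every bit position derived from the key.
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    // False means definitely absent; true means possibly present.
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

The useful property for a storage engine is the one-sided error: a negative answer lets a lookup skip work entirely, while a positive answer still has to be confirmed.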
  
  === concurrent ===
  
- concurrent deals with threading and thread pools. Interestingly, the few 
custom concurrency primitives that C* uses belong to a level 2 package under 
utils, and not to this package.
+ `concurrent` deals with threading and thread pools. Interestingly, the few 
custom concurrency primitives that C* uses belong to a level 2 package under 
`utils`, and not to this package.
  
  === exceptions ===
  
@@ -122, +122 @@

  
  === tracing ===
  
- tracing implements support for request tracing, whereupon some or all 
requests to Cassandra will cause verbose logging to be output for the purposes 
of debugging or performance tuning.
+ `tracing` implements support for request tracing, whereupon some or all 
requests to Cassandra will cause verbose logging to be output for the purposes 
of debugging or performance tuning.
  
  === metrics ===
  
- This package allows collecting quantitative data about various aspects of C* 
operation. Metric data can be accessed through Node tool that ships with C*. 
Like tracing, this capability can be important for operational maintenance and 
troubleshooting.
+ This package allows collecting quantitative data about various aspects of C* 
operation. Metric data can be accessed through the `Node` tool that ships with C*. 
Like tracing, this capability can be important for operational maintenance and 
troubleshooting.
  
  === auth ===
  
- auth enables support for authentication and authorization, providing a 
measure of access control for C* service.
+ `auth` enables support for authentication and authorization, providing a 
measure of access control for the C* service.
  
  === cli ===
  
- cli implements a Command Line Interface client for interacting with a C* 
cluster from a remote node.
+ `cli` implements a Command Line Interface client for interacting with a C* 
cluster from a remote node.
  
  === hadoop ===
  
- hadoop exposes C* in terms of MapReduce/Pig primitives, allowing integration 
with Hadoop clients. This package is expected to be used as an adapter in 
external Hadoop applications; there is no code here that actually runs server 
side.
+ `hadoop` exposes C* in terms of MapReduce/Pig primitives, allowing 
integration with Hadoop clients. This package is expected to be used as an 
adapter in external Hadoop applications; there is no code here that actually 
runs server side.
  
  === client ===
  
- A tiny package that provides helper functionality to code that runs against 
C* on the client side. Only hadoop uses client out of the box.
+ A tiny package that provides helper functionality to code that runs against 
C* on the client side. Only `hadoop` uses `client` out of the box.
