PDavid commented on code in PR #7419: URL: https://github.com/apache/hbase/pull/7419#discussion_r2509512212
########## hbase-website/README.md: ########## @@ -0,0 +1,612 @@ +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +--> + +# Apache HBase Website + +The official website for Apache HBase, built with modern web technologies to provide a fast, accessible, and maintainable web presence. + +--- + +## Table of Contents + +- [Content Editing](#content-editing) +- [Development](#development) + - [Prerequisites](#prerequisites) + - [Technology Stack](#technology-stack) + - [Project Architecture](#project-architecture) + - [Getting Started](#getting-started) + - [Development Workflow](#development-workflow) + - [Building for Production](#building-for-production) + - [Maven Integration](#maven-integration) + - [Deployment](#deployment) + - [Troubleshooting](#troubleshooting) + +--- + +## Content Editing + +Most pages (except the home page) store content in **Markdown (`.md`)** or **JSON (`.json`)** files located in `app/pages/[page-name]/`. This makes it easy to update content without touching code. 
+ +**Examples:** +- `app/pages/team/content.md` - Markdown content for team page +- `app/pages/powered-by-hbase/companies.json` - JSON data for companies +- `app/pages/news/events.json` - JSON data for news/events + +Edit these files with any text editor, then run `npm run build` to regenerate the site. + +--- + +## Development + +### Prerequisites + +Before you begin, ensure you have the following installed: + +- **Node.js version 20.19+ or 22.12+** - JavaScript runtime (like the JVM for Java) + - Download from [nodejs.org](https://nodejs.org/) + - Verify installation: `node --version` (should show v20.19+ or v22.12+) + +- **NPM** - Node Package Manager (like Maven for Java) + - Comes bundled with Node.js + - Verify installation: `npm --version` + +### Technology Stack + +This website uses modern web technologies. Here's what each one does (with Java analogies): + +#### Core Framework +- **React Router** - Full-stack web framework with Server-Side Generation (SSG) + - Handles routing (like Spring MVC controllers) + - Provides server-side rendering for better performance and SEO + - Enables progressive enhancement (see below) + - [Documentation](https://reactrouter.com/) + +#### Progressive Enhancement + +The website uses **progressive enhancement** ([learn more](https://reactrouter.com/explanation/progressive-enhancement)), which means: + +- **With JavaScript enabled**: Users get a Single Page Application (SPA) experience + - Fast page transitions without full page reloads + - Smooth animations and interactive features + - Enhanced user experience + +- **Without JavaScript**: Users still get a fully functional website + - All links and forms work via traditional HTML + - Content is accessible to everyone + - Better for search engines and accessibility tools + +This approach ensures the website works for all users, regardless of their browser capabilities or connection speed. 
+ +#### UI Components +- **shadcn/ui** - Pre-built, accessible UI components + - Built on top of Radix UI primitives + - Similar to a component library like PrimeFaces or Vaadin in Java + - Provides buttons, cards, navigation menus, etc. + - [Documentation](https://ui.shadcn.com/) + +- **Radix UI** - Low-level, accessible UI primitives + - The foundation that shadcn/ui builds upon + - Handles complex accessibility (ARIA) requirements automatically + - Think of it as the "Spring Framework" for UI components + +#### Styling +- **TailwindCSS** - Utility-first CSS framework + - Instead of writing CSS files, you apply classes directly in components + - Example: `className="text-blue-500 font-bold"` makes blue, bold text + +#### Code Quality Tools +- **TypeScript** - Typed superset of JavaScript + - Similar to Java's type system + - Catches errors at compile-time instead of runtime + - Provides autocomplete and better IDE support + +- **ESLint + Prettier** - Code linting and formatting (like Checkstyle + google-java-format) + - ESLint analyzes code for potential errors and enforces coding standards + - Prettier handles automatic code formatting (spacing, indentation, etc.) + - Integrated together: `npm run lint:fix` handles both linting and formatting + - Configuration: `eslint.config.js` and `prettier.config.js` + +### Project Architecture + +The project follows a clear directory structure with separation of concerns: + +``` +my-react-router-app/ +├── app/ # Application source code +│ ├── ui/ # Reusable UI components (no business logic) +│ │ ├── button.tsx # Generic button component +│ │ ├── card.tsx # Card container component +│ │ └── ... 
# Other UI primitives +│ │ +│ ├── components/ # Reusable components WITH business logic +│ │ ├── site-navbar.tsx # Website navigation bar +│ │ ├── site-footer.tsx # Website footer +│ │ ├── theme-toggle.tsx # Dark/light mode toggle +│ │ └── markdown-layout.tsx # Layout for markdown content pages +│ │ +│ ├── pages/ # Complete pages (composed of ui + components) +│ │ ├── home/ # Home page +│ │ │ ├── index.tsx # Main page component (exported) +│ │ │ ├── hero.tsx # Hero section (not exported) +│ │ │ ├── features.tsx # Features section (not exported) +│ │ │ └── ... +│ │ ├── team/ # Team page +│ │ │ ├── index.tsx # Main page component (exported) +│ │ │ └── content.md # Markdown content +│ │ └── ... +│ │ +│ ├── routes/ # Route definitions and metadata +│ │ ├── home.tsx # Home route configuration +│ │ ├── team.tsx # Team route configuration +│ │ └── ... +│ │ +│ ├── lib/ # Utility functions and integrations +│ │ ├── utils.ts # Helper functions +│ │ └── theme-provider.tsx # Theme management +│ │ +│ ├── routes.ts # Main routing configuration +│ ├── root.tsx # Root layout component +│ └── app.css # Global styles +│ +├── build/ # Generated files (DO NOT EDIT) +│ ├── client/ # Browser-side assets +│ │ ├── index.html # HTML files for each page +│ │ ├── assets/ # JavaScript, CSS bundles +│ │ └── images/ # Optimized images +│ └── server/ # Server-side code (if using SSR) +│ +├── public/ # Static files (copied as-is to build/) +│ ├── favicon.ico # Website icon +│ └── images/ # Images and other static assets +│ +├── node_modules/ # Dependencies (like Maven's .m2 directory) +├── package.json # Project metadata and dependencies (like pom.xml) +├── tsconfig.json # TypeScript configuration +└── react-router.config.ts # React Router framework configuration +``` + +#### Key Principles + +1. **UI Components (`/ui`)**: Pure, reusable components with no business logic + - Can be used anywhere in the application + - Only concerned with appearance and basic interaction + +2. 
**Business Components (`/components`)**: Reusable across pages + - May contain business logic specific to HBase website + - Examples: navigation, footer, theme toggle + +3. **Pages (`/pages`)**: Complete pages combining ui and components + - Each page has its own directory + - Only `index.tsx` is exported + - Internal components stay within the page directory + - If a component needs to be reused, move it to `/components` + +4. **Routes (`/routes`)**: Define routing and metadata + - Maps URLs to pages + - Sets page titles, meta tags, etc. + +### Getting Started + +#### 1. Install Dependencies + +Think of this as `mvn install`: + +```bash +npm install +``` + +This downloads all required packages from npm (similar to Maven Central). + +#### 2. Start Development Server + +```bash +npm run dev +``` + +This starts a local development server with: +- **Hot Module Replacement (HMR)**: Code changes appear instantly without full page reload +- **Live at**: `http://localhost:5173` + +### Development Workflow + +#### Making Changes + +1. **Edit code** in the `app/` directory +2. **Save the file** - changes appear automatically in the browser +3. **Check for errors** in the terminal where `npm run dev` is running + +#### Common Tasks + +**Add a new page:** +1. Create directory in `app/pages/my-new-page/` +2. Create `index.tsx` in that directory +3. Create route file in `app/routes/my-new-page.tsx` +4. Register route in `app/routes.ts` + +**Update content:** +- Edit the appropriate `.md` or `.json` file +- Changes appear automatically + +**Add a UI component:** +- Check if shadcn/ui has what you need first +- Only create custom components if necessary + +**Check code quality:** +```bash +npm run lint +``` + +**Fix linting and formatting issues:** +```bash +npm run lint:fix +``` + +### Testing + +The project uses [Vitest](https://vitest.dev/) for testing React components. 
+ +**Run tests:** +```bash +# Run tests in watch mode (for development) +npm test + +# Run tests once (for CI/CD) +npm run test:run + +# Run tests with UI +npm run test:ui +``` + +**Test coverage includes:** +- Home Page - Hero section, buttons, features, use cases, community sections +- Theme Toggle - Light/dark mode switching +- Navigation - Navbar, dropdown menus, links +- Markdown Rendering - Headings, lists, code blocks, tables, links + +**Writing new tests:** + +Use the `renderWithProviders` utility in `test/utils.tsx` to ensure components have access to routing and theme context: + +```typescript +import { renderWithProviders, screen } from './utils' +import { MyComponent } from '@/components/my-component' + +describe('MyComponent', () => { + it('renders correctly', () => { + renderWithProviders(<MyComponent />) + expect(screen.getByText('Hello World')).toBeInTheDocument() + }) +}) +``` + +**CI/CD Workflow:** + +Before merging or deploying, run the full CI pipeline: + +```bash +npm run ci +``` + +This command runs all quality checks and builds the project: +1. `npm run lint` - Check linting +2. `npm run typecheck` - Check types +3. `npm run test:run` - Run tests +4. `npm run build` - Build for production + +All checks must pass before code is considered ready for deployment. + +**CI/CD Pipeline Example:** +```yaml +# Example for GitHub Actions, GitLab CI, etc. +- npm run ci # Runs all checks and build +``` + +### Building for Production + +Create an optimized production build: + +```bash +npm run build +``` + +This command: +1. Compiles TypeScript to JavaScript +2. Bundles and minifies all code +3. Optimizes images and assets +4. Generates static HTML files +5. 
Outputs everything to `build/` directory + +**Generated files location:** +``` +build/ +├── client/ # Everything needed for the website +│ ├── *.html # Pre-rendered HTML pages +│ ├── assets/ # Optimized JavaScript and CSS +│ │ ├── *.js # JavaScript bundles (minified) +│ │ ├── *.css # Stylesheets (minified) +│ │ └── manifest-*.js # Asset manifest +│ └── images/ # Optimized images +└── server/ # Server-side code (if applicable) +``` + +The `build/client/` directory contains everything needed to deploy the website to any static file host. + +### Maven Integration + +The website is integrated with the Apache HBase Maven build system using the `frontend-maven-plugin`. This allows the website to be built as part of the main HBase build or separately using Maven commands. + +#### What Gets Executed + +When you run the Maven build, it automatically: + +1. **Installs Node.js v22.20.0 and npm 10.5.0** (if not already available) + - Installed to `target/` directory + - Does not affect your system Node/npm installation + +2. **Runs `npm install`** to install all dependencies + - Reads from `package.json` + - Installs to `node_modules/` + +3. **Runs `npm run ci`** which executes: + - `npm run lint` - ESLint code quality checks + - `npm run typecheck` - TypeScript type checking + - `npm run test:run` - Vitest unit tests + - `npm run build` - Production build + +4. **Build Output**: Generated files are in `build/` directory + +#### Maven Commands + +**Build Website with Full HBase Build:** +```bash +# From HBase root directory +mvn clean install +``` + +The website will be built automatically as part of the full build. + +**Build Website Only:** +```bash +# Option 1: From HBase root directory +mvn clean install -pl src/site Review Comment: I think mentions of `src/site` should be replaced. 
########## hbase-website/app/pages/acid-semantics/content.md: ########## @@ -0,0 +1,126 @@ +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +--> + +# ACID properties of HBase + +Apache HBase (TM) is not an ACID compliant database. However, it does guarantee certain specific properties. + +## Definitions + +For the sake of common vocabulary, we define the following terms: + +**Atomicity** +an operation is atomic if it either completes entirely or not at all + +**Consistency** +all actions cause the table to transition from one valid state directly to another (eg a row will not disappear during an update, etc) + +**Isolation** +an operation is isolated if it appears to complete independently of any other concurrent transaction + +**Durability** +any update that reports "successful" to the client will not be lost + +**Visibility** +an update is considered visible if any subsequent read will see the update as having been committed + +The terms _must_ and _may_ are used as specified by RFC 2119. In short, the word "must" implies that, if some case exists where the statement is not true, it is a bug. The word "may" implies that, even if the guarantee is provided in a current release, users should not rely on it. 
+ +## APIs to consider + +- Read APIs + - get + - scan +- Write APIs + - put + - batch put + - delete +- Combination (read-modify-write) APIs + - incrementColumnValue + - checkAndPut + +## Guarantees Provided + +### Atomicity + +1. All mutations are atomic within a row. Any put will either wholly succeed or wholly fail.[3] + 1. An operation that returns a "success" code has completely succeeded. + 2. An operation that returns a "failure" code has completely failed. + 3. An operation that times out may have succeeded and may have failed. However, it will not have partially succeeded or failed. +2. This is true even if the mutation crosses multiple column families within a row. +3. APIs that mutate several rows will _not_ be atomic across the multiple rows. For example, a multiput that operates on rows 'a','b', and 'c' may return having mutated some but not all of the rows. In such cases, these APIs will return a list of success codes, each of which may be succeeded, failed, or timed out as described above. +4. The checkAndPut API happens atomically like the typical compareAndSet (CAS) operation found in many hardware architectures. +5. The order of mutations is seen to happen in a well-defined order for each row, with no interleaving. For example, if one writer issues the mutation "a=1,b=1,c=1" and another writer issues the mutation "a=2,b=2,c=2", the row must either be "a=1,b=1,c=1" or "a=2,b=2,c=2" and must _not_ be something like "a=1,b=2,c=1". + 1. Please note that this is not true _across rows_ for multirow batch mutations. + +### Consistency and Isolation + +1. All rows returned via any access API will consist of a complete row that existed at some point in the table's history. +2. This is true across column families - i.e a get of a full row that occurs concurrent with some mutations 1,2,3,4,5 will return a complete row that existed at some point in time between mutation i and i+1 for some i between 1 and 5. +3. 
The state of a row will only move forward through the history of edits to it. + +#### Consistency of Scans + +A scan is **not** a consistent view of a table. Scans do **not** exhibit _snapshot isolation_. + +Rather, scans have the following properties: + +1. Any row returned by the scan will be a consistent view (i.e. that version of the complete row existed at some point in time) [1] +2. A scan will always reflect a view of the data _at least as new as_ the beginning of the scan. This satisfies the visibility guarantees enumerated below. + 1. For example, if client A writes data X and then communicates via a side channel to client B, any scans started by client B will contain data at least as new as X. + 2. A scan _must_ reflect all mutations committed prior to the construction of the scanner, and _may_ reflect some mutations committed subsequent to the construction of the scanner. + 3. Scans must include _all_ data written prior to the scan (except in the case where data is subsequently mutated, in which case it _may_ reflect the mutation) + +Those familiar with relational databases will recognize this isolation level as "read committed". + +Please note that the guarantees listed above regarding scanner consistency are referring to "transaction commit time", not the "timestamp" field of each cell. That is to say, a scanner started at time _t_ may see edits with a timestamp value greater than _t_, if those edits were committed with a "forward dated" timestamp before the scanner was constructed. + +### Visibility + +1. When a client receives a "success" response for any mutation, that mutation is immediately visible to both that client and any client with whom it later communicates through side channels. [3] +2. A row must never exhibit so-called "time-travel" properties. That is to say, if a series of mutations moves a row sequentially through a series of states, any sequence of concurrent reads will return a subsequence of those states. + 1. 
For example, if a row's cells are mutated using the "incrementColumnValue" API, a client must never see the value of any cell decrease. + 2. This is true regardless of which read API is used to read back the mutation. +3. Any version of a cell that has been returned to a read operation is guaranteed to be durably stored. + +### Durability + +1. All visible data is also durable data. That is to say, a read will never return data that has not been made durable on disk[2] +2. Any operation that returns a "success" code (eg does not throw an exception) will be made durable.[3] +3. Any operation that returns a "failure" code will not be made durable (subject to the Atomicity guarantees above) +4. All reasonable failure scenarios will not affect any of the guarantees of this document. + +### Tunability + +All of the above guarantees must be possible within Apache HBase. For users who would like to trade off some guarantees for performance, HBase may offer several tuning options. For example: + +- Visibility may be tuned on a per-read basis to allow stale reads or time travel. +- Durability may be tuned to only flush data to disk on a periodic basis + +## More Information + +For more information, see the [client architecture](book.html#client) or [data model](book.html#datamodel) sections in the Apache HBase Reference Guide. Review Comment: These links are not correct here, right now they will be: - https://hbase.apache.org/acid-semantics/book.html#client - https://hbase.apache.org/acid-semantics/book.html#datamodel I think these should be: - https://hbase.apache.org/book.html#client - https://hbase.apache.org/book.html#datamodel ########## hbase-website/app/components/getting-started.tsx: ########## @@ -0,0 +1,77 @@ +// +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. 
The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. +// + +import { Button } from "@/ui/button"; +import { Link } from "react-router"; + +export function GettingStartedSection() { + const steps = [ + { + title: "1. Download", + desc: "Grab the latest stable release and verify checksums.", + to: "/downloads" + }, + { + title: "2. Read the Guide", + desc: "Walk through cluster setup, schema design, and operations.", + to: "https://hbase.apache.org/book.html#_get_started_with_hbase" + }, + { + title: "3. Connect a Client", + desc: "Use the Java API, REST, or Thrift to start building.", + to: "https://hbase.apache.org/book.html#config.files" Review Comment: Right now here we link to the Default Configuration Files chapter but I believe linking to "Apache HBase APIs" (https://hbase.apache.org/book.html#hbase_apis) would be better.
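
For the relative-link comment on `acid-semantics/content.md`, the resolution behaviour can be checked with the standard `URL` constructor. This is only a minimal sketch: it assumes the page is served at `https://hbase.apache.org/acid-semantics/` (with a trailing slash), and `resolveHref` is a hypothetical helper, not something in the PR.

```typescript
// Assumption: the ACID page is served from this URL (note the trailing slash).
const pageUrl = "https://hbase.apache.org/acid-semantics/";

// Hypothetical helper: resolve an href against the page URL the way a
// browser would, using the standard WHATWG URL API.
function resolveHref(href: string, base: string): string {
  return new URL(href, base).href;
}

// A document-relative link is resolved under the page's own directory.
const relativeLink = resolveHref("book.html#client", pageUrl);

// A root-relative link is resolved against the site root instead.
const rootRelativeLink = resolveHref("/book.html#client", pageUrl);

console.log(relativeLink);
// https://hbase.apache.org/acid-semantics/book.html#client
console.log(rootRelativeLink);
// https://hbase.apache.org/book.html#client
```

Under that assumption, writing the links as `/book.html#client` and `/book.html#datamodel`, or as absolute URLs, would produce the targets suggested in the comment.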
