This is an automated email from the ASF dual-hosted git repository.

skrawcz pushed a commit to branch stefan/add-skills
in repository https://gitbox.apache.org/repos/asf/hamilton.git

commit f290bc78d20685009b180947a3a66a9890e21dd3
Author: Stefan Krawczyk <[email protected]>
AuthorDate: Sat Jan 31 10:56:57 2026 -0800

    Add Claude Code plugin for AI-assisted Hamilton development
    
    This commit adds a comprehensive Claude Code plugin/skill that provides
    AI-powered assistance for Hamilton DAG development.
    
    Features:
    - Create Hamilton modules with proper patterns and decorators
    - Understand and explain existing DAGs
    - Apply function modifiers (@parameterize, @config.when, @check_output, etc.)
    - Convert Python scripts to Hamilton modules
    - Debug issues (circular dependencies, missing nodes, etc.)
    - Optimize pipelines with caching and parallelization
    - Generate tests and documentation
    - LLM/RAG workflow patterns
    - Integration patterns (Airflow, FastAPI, Streamlit, etc.)
    
    Structure:
    - .claude/skills/hamilton/ - Auto-available for Hamilton contributors
    - .claude/plugins/hamilton/ - Installable plugin for external users
    - docs/ecosystem/claude-code-plugin.md - User documentation
    
    Installation for users:
      /plugin marketplace add dagworks-inc/hamilton
      /plugin install hamilton --scope user
    
    All files include Apache 2.0 license headers.
    
    Contributing: Users can file issues or submit PRs to improve the skill.
---
 .../hamilton/.claude-plugin/marketplace.json       |  34 ++
 .../plugins/hamilton/.claude-plugin/plugin.json    |  23 +
 .claude/plugins/hamilton/CHANGELOG.md              |  55 ++
 .claude/plugins/hamilton/README.md                 | 257 +++++++++
 .claude/plugins/hamilton/skills/hamilton/README.md | 232 ++++++++
 .claude/plugins/hamilton/skills/hamilton/SKILL.md  | 474 ++++++++++++++++
 .../plugins/hamilton/skills/hamilton/examples.md   | 610 +++++++++++++++++++++
 .claude/settings.local.json                        |  31 ++
 .claude/skills/hamilton/README.md                  | 251 +++++++++
 .claude/skills/hamilton/SKILL.md                   | 474 ++++++++++++++++
 .claude/skills/hamilton/examples.md                | 610 +++++++++++++++++++++
 docs/ecosystem/claude-code-plugin.md               | 406 ++++++++++++++
 docs/ecosystem/index.md                            |   1 +
 13 files changed, 3458 insertions(+)

diff --git a/.claude/plugins/hamilton/.claude-plugin/marketplace.json b/.claude/plugins/hamilton/.claude-plugin/marketplace.json
new file mode 100644
index 00000000..4468df5f
--- /dev/null
+++ b/.claude/plugins/hamilton/.claude-plugin/marketplace.json
@@ -0,0 +1,34 @@
+{
+  "name": "Hamilton Plugin Marketplace",
+  "description": "Official Claude Code plugin for the Hamilton framework",
+  "version": "1.0.0",
+  "owner": {
+    "name": "DAGWorks Inc.",
+    "url": "https://github.com/dagworks-inc"
+  },
+  "plugins": [
+    {
+      "name": "hamilton",
+      "source": "./",
+      "description": "Expert AI assistant for Hamilton framework development - create DAGs, apply decorators, debug dataflows, and optimize pipelines",
+      "version": "1.0.0",
+      "author": {
+        "name": "Hamilton Team",
+        "email": "[email protected]"
+      },
+      "homepage": "https://github.com/dagworks-inc/hamilton",
+      "repository": "https://github.com/dagworks-inc/hamilton",
+      "license": "Apache-2.0",
+      "keywords": [
+        "hamilton",
+        "dag",
+        "dataflow",
+        "workflow",
+        "pipeline",
+        "data-engineering",
+        "ml-ops",
+        "feature-engineering"
+      ]
+    }
+  ]
+}
diff --git a/.claude/plugins/hamilton/.claude-plugin/plugin.json b/.claude/plugins/hamilton/.claude-plugin/plugin.json
new file mode 100644
index 00000000..98b7680b
--- /dev/null
+++ b/.claude/plugins/hamilton/.claude-plugin/plugin.json
@@ -0,0 +1,23 @@
+{
+  "name": "hamilton",
+  "version": "1.0.0",
+  "description": "Expert AI assistant for Hamilton framework development - create DAGs, apply decorators, debug dataflows, and optimize pipelines",
+  "author": {
+    "name": "Hamilton Team",
+    "email": "[email protected]"
+  },
+  "homepage": "https://github.com/dagworks-inc/hamilton",
+  "repository": "https://github.com/dagworks-inc/hamilton",
+  "license": "Apache-2.0",
+  "keywords": [
+    "hamilton",
+    "dag",
+    "dataflow",
+    "workflow",
+    "pipeline",
+    "data-engineering",
+    "ml-ops",
+    "feature-engineering"
+  ],
+  "skills": "./skills/"
+}
diff --git a/.claude/plugins/hamilton/CHANGELOG.md b/.claude/plugins/hamilton/CHANGELOG.md
new file mode 100644
index 00000000..c273dbff
--- /dev/null
+++ b/.claude/plugins/hamilton/CHANGELOG.md
@@ -0,0 +1,55 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Changelog
+
+All notable changes to the Hamilton Claude Code plugin will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [1.0.0] - 2026-01-31
+
+### Added
+- Initial release of Hamilton Claude Code plugin
+- Comprehensive skill for Hamilton DAG development
+- Support for creating new Hamilton modules with best practices
+- Function modifier guidance (@parameterize, @config.when, @extract_columns, @check_output, etc.)
+- Code conversion assistance (Python scripts → Hamilton modules)
+- DAG visualization and understanding
+- Debugging assistance for common issues
+- Data quality validation patterns
+- LLM/RAG workflow examples
+- Feature engineering patterns
+- Integration examples:
+  - Airflow
+  - FastAPI
+  - Streamlit
+  - Jupyter notebooks
+- Parallel execution patterns (ThreadPool, Ray, Dask, Spark)
+- Caching strategies
+- Testing guidance
+
+### Documentation
+- Comprehensive SKILL.md with all Hamilton patterns
+- examples.md with 60+ production-ready code examples
+- README.md with installation and usage instructions
+- Plugin manifest (plugin.json) and marketplace (marketplace.json)
+
+[1.0.0]: https://github.com/dagworks-inc/hamilton/releases/tag/claude-plugin-v1.0.0
diff --git a/.claude/plugins/hamilton/README.md b/.claude/plugins/hamilton/README.md
new file mode 100644
index 00000000..2336da5b
--- /dev/null
+++ b/.claude/plugins/hamilton/README.md
@@ -0,0 +1,257 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hamilton Plugin for Claude Code
+
+A comprehensive AI assistant skill for 
[Hamilton](https://github.com/dagworks-inc/hamilton) development, designed to 
help you build, debug, and optimize Hamilton DAGs using Claude Code.
+
+## What is This?
+
+This is a [Claude Code plugin](https://code.claude.com/docs/en/plugins.md) that provides expert assistance for Hamilton development. When active, Claude Code understands Hamilton's patterns and best practices, and can help you:
+
+- 🏗️ **Create new Hamilton modules** with proper patterns and decorators
+- 🔍 **Understand existing DAGs** by explaining dataflow and dependencies
+- 🎨 **Apply function modifiers** correctly (@parameterize, @config.when, 
@check_output, etc.)
+- 🐛 **Debug issues** in DAG definitions and execution
+- 🔄 **Convert Python scripts** to Hamilton modules
+- ⚡ **Optimize pipelines** with caching, parallelization, and best practices
+- ✅ **Write tests** for Hamilton functions
+- 📊 **Generate visualizations** of your DAGs
+
+## Installation
+
+### Option 1: Install via Plugin System (Recommended for Users)
+
+```bash
+# Add the Hamilton plugin marketplace
+/plugin marketplace add dagworks-inc/hamilton
+
+# Install the plugin
+/plugin install hamilton --scope user
+```
+
+Or in one command:
+```bash
+claude plugin install hamilton@dagworks-inc/hamilton --scope user
+```
+
+**Installation scopes:**
+- `--scope user` - Available in all your projects (recommended)
+- `--scope project` - Only in current project
+- `--scope local` - Testing/development only
+
+### Option 2: For Hamilton Contributors
+
+If you've cloned the Hamilton repository, the skill is already available at 
`.claude/skills/hamilton/` and will be automatically discovered by Claude Code. 
No installation needed!
+
+### Option 3: Manual Installation
+
+Copy the skill to your personal or project skills directory:
+
+```bash
+# Personal (available everywhere)
+cp -r .claude/plugins/hamilton/skills/hamilton ~/.claude/skills/
+
+# Project-specific
+cp -r .claude/plugins/hamilton/skills/hamilton /path/to/your/project/.claude/skills/
+```
+
+## Usage
+
+### Automatic Invocation
+
+Claude Code will automatically use this skill when it detects you're working 
with Hamilton code. Just ask questions or give instructions naturally:
+
+```
+"Help me create a Hamilton module for processing customer data"
+"Explain what this DAG does"
+"Convert this pandas script to Hamilton"
+"Add caching to my expensive computation function"
+"Why am I getting a circular dependency error?"
+```
+
+### Manual Invocation
+
+You can explicitly invoke the skill using the `/hamilton` command:
+
+```
+/hamilton create a feature engineering module with rolling averages
+/hamilton explain the dataflow in my_functions.py
+/hamilton optimize this DAG for parallel execution
+```
+
+## What the Skill Knows
+
+This skill has deep knowledge of:
+
+- **Core Hamilton concepts**: Drivers, DAGs, nodes, function-based definitions
+- **Function modifiers**: All decorators (@parameterize, @config.when, @extract_columns, @check_output, @save_to, @load_from, @cache, @pipe, @does, @mutate, @step, etc.)
+- **Execution patterns**: Sequential, parallel, distributed (Ray, Dask, Spark)
+- **Data quality**: Validation, schema checking, data quality pipelines
+- **I/O patterns**: Materialization, data loaders, result adapters
+- **Integration patterns**: Airflow, Streamlit, FastAPI, Jupyter
+- **LLM workflows**: RAG pipelines, document processing, embeddings
+- **Testing strategies**: Unit testing functions, integration testing DAGs
+- **Debugging techniques**: Circular dependencies, visualization, lineage tracing
+
+## Examples
+
+### Creating a New Hamilton Module
+
+```
+"Create a Hamilton module that loads data from a CSV, cleans it by removing
+nulls, calculates a 7-day rolling average of the 'sales' column, and outputs
+the top 10 days by sales."
+```
+
+Claude will generate:
+- Properly structured functions with type hints
+- Correct dependency declarations via parameters
+- Appropriate docstrings
+- Driver setup code
+- Suggestions for visualization
+
+### Converting Existing Code
+
+```
+"Convert this script to Hamilton:
+
+import pandas as pd
+df = pd.read_csv('data.csv')
+df['feature'] = df['col_a'] * 2 + df['col_b']
+result = df.groupby('category')['feature'].mean()
+"
+```
+
+Claude will refactor it into a clean Hamilton module with separate functions 
for each transformation step.
+
+### Applying Decorators
+
+```
+"I need to create rolling averages for 7, 30, and 90 day windows.
+How do I do this in Hamilton without repeating code?"
+```
+
+Claude will show you how to use `@parameterize` to create multiple nodes from 
a single function.
+
+### Debugging
+
+```
+"I'm getting an error: 'Could not find parameter 'processed_data' in graph'.
+What's wrong?"
+```
+
+Claude will help identify the issue (likely a typo or missing function 
definition) and suggest fixes.
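+
+As a rough sketch of what is usually behind that error: Hamilton resolves each function parameter by looking for a node (function) with the same name, so a one-character typo breaks the edge. The function names below are hypothetical, and the resolution loop only mimics what Hamilton does internally:
+
```python
import inspect

def proccessed_data(raw: list) -> list:  # note the typo: "proccessed"
    """Drop missing values."""
    return [x for x in raw if x is not None]

def final_count(processed_data: list) -> int:
    """Count the processed records."""
    return len(processed_data)

# Mimic Hamilton's name-based dependency resolution: every parameter of
# `final_count` must match some function name in the module.
funcs = {f.__name__: f for f in (proccessed_data, final_count)}
missing = [p for p in inspect.signature(final_count).parameters if p not in funcs]
print(missing)  # ['processed_data'] — the dependency Hamilton cannot find
```
+
+Renaming `proccessed_data` to `processed_data` restores the edge and the error goes away.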
+
+## Skill Features
+
+### Allowed Tools
+
+This skill is configured with permissions to:
+- Read files (`Read`, `Grep`, `Glob`)
+- Run Python code (`Bash(python:*)`)
+- Search for files (`Bash(find:*)`)
+- Run tests (`Bash(pytest:*)`)
+
+These tools are automatically permitted when the skill is active, streamlining 
the workflow.
+
+### Reference Materials
+
+The skill includes additional reference files:
+
+- **[examples.md](skills/hamilton/examples.md)** - Comprehensive code examples 
for common patterns
+  - Basic DAG creation
+  - Advanced function modifiers
+  - LLM & RAG workflows
+  - Feature engineering patterns
+  - Data quality validation
+  - Parallel execution
+  - Integration patterns (Airflow, FastAPI, Streamlit)
+
+## Requirements
+
+- **Claude Code CLI** - Install from https://code.claude.com
+- **Hamilton** - The skill works with any version, but references Hamilton 
1.x+ patterns
+- **Python 3.9+** - For running generated Hamilton code
+
+## Contributing
+
+This plugin is open source and part of the Hamilton project! We welcome 
contributions:
+
+### Found a Bug?
+
+Please [file an issue](https://github.com/dagworks-inc/hamilton/issues/new) on 
GitHub with:
+- A clear description of the problem
+- Steps to reproduce
+- Expected vs actual behavior
+- Your Hamilton and Claude Code versions
+
+### Want to Improve It?
+
+Even better - submit a pull request!
+
+1. **Fork the repository**: https://github.com/dagworks-inc/hamilton
+2. **Make your changes**: Edit files in `.claude/skills/hamilton/` or 
`.claude/plugins/hamilton/`
+3. **Test thoroughly**: Try the skill with various Hamilton scenarios
+4. **Submit a PR**: Include a clear description of your improvements
+
+**Types of contributions we love:**
+- 📚 Add new examples to `examples.md`
+- 📝 Improve instructions in `SKILL.md`
+- 🐛 Fix bugs or inaccuracies
+- ✨ Add support for new Hamilton features
+- 📖 Enhance documentation
+
+See [CONTRIBUTING.md](../../../CONTRIBUTING.md) in the Hamilton repo for 
detailed guidelines.
+
+## Philosophy
+
+This skill follows Hamilton's core philosophy:
+
+- **Declarative over imperative**: Guide users toward function-based 
definitions
+- **Separation of concerns**: Keep definition, execution, and observation 
separate
+- **Reusability**: Encourage patterns that make code testable and portable
+- **Simplicity**: Prefer simple solutions and avoid over-engineering
+
+## Changelog
+
+### v1.0.0 (2026-01-31)
+- Initial release
+- Comprehensive Hamilton DAG creation assistance
+- Support for all major function modifiers
+- LLM/RAG workflow patterns
+- Feature engineering examples
+- Data quality validation patterns
+- Integration examples (Airflow, FastAPI, Streamlit)
+
+## Learn More
+
+- **Hamilton Documentation**: https://hamilton.dagworks.io
+- **GitHub Repository**: https://github.com/dagworks-inc/hamilton
+- **Hamilton Examples**: See `examples/` directory in the repo (60+ production 
examples)
+- **DAGWorks Blog**: https://blog.dagworks.io
+- **Community Slack**: Join via Hamilton GitHub repo
+
+## License
+
+This plugin is part of the Hamilton project and is licensed under the Apache 
2.0 License.
+
+---
+
+**Happy Hamilton coding with Claude! 🚀**
diff --git a/.claude/plugins/hamilton/skills/hamilton/README.md b/.claude/plugins/hamilton/skills/hamilton/README.md
new file mode 100644
index 00000000..3407c22f
--- /dev/null
+++ b/.claude/plugins/hamilton/skills/hamilton/README.md
@@ -0,0 +1,232 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hamilton Claude Code Skill
+
+A comprehensive AI assistant skill for 
[Hamilton](https://github.com/dagworks-inc/hamilton) development, designed to 
help you build, debug, and optimize Hamilton DAGs using Claude Code.
+
+## What is This?
+
+This is a [Claude Code skill](https://code.claude.com/docs/en/skills.md) that provides expert assistance for Hamilton development. When active, Claude Code understands Hamilton's patterns and best practices, and can help you:
+
+- 🏗️ **Create new Hamilton modules** with proper patterns and decorators
+- 🔍 **Understand existing DAGs** by explaining dataflow and dependencies
+- 🎨 **Apply function modifiers** correctly (@parameterize, @config.when, 
@check_output, etc.)
+- 🐛 **Debug issues** in DAG definitions and execution
+- 🔄 **Convert Python scripts** to Hamilton modules
+- ⚡ **Optimize pipelines** with caching, parallelization, and best practices
+- ✅ **Write tests** for Hamilton functions
+- 📊 **Generate visualizations** of your DAGs
+
+## Installation
+
+### For Hamilton Contributors (Automatic)
+
+If you've cloned the Hamilton repository, the skill is already available! It's 
located in `.claude/skills/hamilton/` and will be automatically discovered by 
Claude Code.
+
+### For Hamilton Users (Manual Installation)
+
+To use this skill in your own Hamilton projects:
+
+#### Option 1: Copy to Your Project (Project-Scoped)
+
+```bash
+# In your Hamilton project directory
+mkdir -p .claude/skills
+cp -r /path/to/hamilton/.claude/skills/hamilton .claude/skills/
+
+# Commit to version control so your team gets it too
+git add .claude/skills/hamilton
+git commit -m "Add Hamilton Claude Code skill"
+```
+
+#### Option 2: Install Globally (Available Everywhere)
+
+```bash
+# Install to your personal Claude Code skills directory
+mkdir -p ~/.claude/skills
+cp -r /path/to/hamilton/.claude/skills/hamilton ~/.claude/skills/
+
+# Now available in all your projects!
+```
+
+#### Option 3: Symlink (For Active Development)
+
+```bash
+# Create symlink to stay in sync with Hamilton repo updates
+ln -s /path/to/hamilton/.claude/skills/hamilton ~/.claude/skills/hamilton
+```
+
+## Usage
+
+### Automatic Invocation
+
+Claude Code will automatically use this skill when it detects you're working 
with Hamilton code. Just ask questions or give instructions naturally:
+
+```
+"Help me create a Hamilton module for processing customer data"
+"Explain what this DAG does"
+"Convert this pandas script to Hamilton"
+"Add caching to my expensive computation function"
+"Why am I getting a circular dependency error?"
+```
+
+### Manual Invocation
+
+You can explicitly invoke the skill using the `/hamilton` command:
+
+```
+/hamilton create a feature engineering module with rolling averages
+/hamilton explain the dataflow in my_functions.py
+/hamilton optimize this DAG for parallel execution
+```
+
+## What the Skill Knows
+
+This skill has deep knowledge of:
+
+- **Core Hamilton concepts**: Drivers, DAGs, nodes, function-based definitions
+- **Function modifiers**: All decorators (@parameterize, @config.when, @extract_columns, @check_output, @save_to, @load_from, @cache, @pipe, @does, @mutate, @step, etc.)
+- **Execution patterns**: Sequential, parallel, distributed (Ray, Dask, Spark)
+- **Data quality**: Validation, schema checking, data quality pipelines
+- **I/O patterns**: Materialization, data loaders, result adapters
+- **Integration patterns**: Airflow, Streamlit, FastAPI, Jupyter
+- **LLM workflows**: RAG pipelines, document processing, embeddings
+- **Testing strategies**: Unit testing functions, integration testing DAGs
+- **Debugging techniques**: Circular dependencies, visualization, lineage tracing
+
+## Examples
+
+### Creating a New Hamilton Module
+
+```
+"Create a Hamilton module that loads data from a CSV, cleans it by removing
+nulls, calculates a 7-day rolling average of the 'sales' column, and outputs
+the top 10 days by sales."
+```
+
+Claude will generate:
+- Properly structured functions with type hints
+- Correct dependency declarations via parameters
+- Appropriate docstrings
+- Driver setup code
+- Suggestions for visualization
+
+### Converting Existing Code
+
+```
+"Convert this script to Hamilton:
+
+import pandas as pd
+df = pd.read_csv('data.csv')
+df['feature'] = df['col_a'] * 2 + df['col_b']
+result = df.groupby('category')['feature'].mean()
+"
+```
+
+Claude will refactor it into a clean Hamilton module with separate functions 
for each transformation step.
+
+### Applying Decorators
+
+```
+"I need to create rolling averages for 7, 30, and 90 day windows.
+How do I do this in Hamilton without repeating code?"
+```
+
+Claude will show you how to use `@parameterize` to create multiple nodes from 
a single function.
+
+### Debugging
+
+```
+"I'm getting an error: 'Could not find parameter 'processed_data' in graph'.
+What's wrong?"
+```
+
+Claude will help identify the issue (likely a typo or missing function 
definition) and suggest fixes.
+
+## Skill Features
+
+### Allowed Tools
+
+This skill is configured with permissions to:
+- Read files (`Read`, `Grep`, `Glob`)
+- Run Python code (`Bash(python:*)`)
+- Search for files (`Bash(find:*)`)
+- Run tests (`Bash(pytest:*)`)
+
+These tools are automatically permitted when the skill is active, streamlining 
the workflow.
+
+### Reference Materials
+
+The skill includes additional reference files:
+
+- **[examples.md](examples.md)** - Comprehensive code examples for common 
patterns
+  - Basic DAG creation
+  - Advanced function modifiers
+  - LLM & RAG workflows
+  - Feature engineering patterns
+  - Data quality validation
+  - Parallel execution
+  - Integration patterns (Airflow, FastAPI, Streamlit)
+
+## Requirements
+
+- **Claude Code CLI** - Install from https://code.claude.com
+- **Hamilton** - The skill works with any version, but references Hamilton 
1.x+ patterns
+- **Python 3.9+** - For running generated Hamilton code
+
+## Contributing
+
+This skill is open source and part of the Hamilton project! Contributions are 
welcome:
+
+1. **Improve examples**: Add new patterns to `examples.md`
+2. **Enhance instructions**: Make `SKILL.md` more helpful
+3. **Add references**: Include links to helpful Hamilton documentation
+4. **Report issues**: Let us know what works and what doesn't
+
+See [CONTRIBUTING.md](../../../CONTRIBUTING.md) in the Hamilton repo for 
guidelines.
+
+## Philosophy
+
+This skill follows Hamilton's core philosophy:
+
+- **Declarative over imperative**: Guide users toward function-based 
definitions
+- **Separation of concerns**: Keep definition, execution, and observation 
separate
+- **Reusability**: Encourage patterns that make code testable and portable
+- **Simplicity**: Prefer simple solutions and avoid over-engineering
+
+## Learn More
+
+- **Hamilton Documentation**: https://hamilton.dagworks.io
+- **GitHub Repository**: https://github.com/dagworks-inc/hamilton
+- **Hamilton Examples**: See `examples/` directory in the repo (60+ production 
examples)
+- **DAGWorks Blog**: https://blog.dagworks.io
+- **Community Slack**: Join via Hamilton GitHub repo
+
+## Version
+
+This skill is maintained alongside Hamilton and evolves with the framework. Last updated: January 2026
+
+## License
+
+This skill is part of the Hamilton project and is licensed under the same 
terms (Apache 2.0 License).
+
+---
+
+**Happy Hamilton coding with Claude! 🚀**
diff --git a/.claude/plugins/hamilton/skills/hamilton/SKILL.md b/.claude/plugins/hamilton/skills/hamilton/SKILL.md
new file mode 100644
index 00000000..22841cc1
--- /dev/null
+++ b/.claude/plugins/hamilton/skills/hamilton/SKILL.md
@@ -0,0 +1,474 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+---
+name: hamilton
+description: Expert assistant for Hamilton DAG development. Use when creating Hamilton modules, applying decorators, visualizing DAGs, debugging dataflows, converting Python code to Hamilton patterns, or optimizing Hamilton pipelines.
+allowed-tools: Read, Grep, Glob, Bash(python:*), Bash(find:*), Bash(pytest:*)
+user-invocable: true
+disable-model-invocation: false
+---
+
+# Hamilton Development Assistant
+
+Hamilton is a lightweight Python framework for building Directed Acyclic Graphs (DAGs) of data transformations using declarative, function-based definitions.
+
+## Core Principles
+
+**Function-Based DAG Definition**
+- Functions with type hints define nodes in the DAG
+- Function parameters automatically create edges (dependencies)
+- Function names become node names in the DAG
+- Pure functions enable easy testing and reusability
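+
+A minimal two-node illustration of these rules, using no Hamilton APIs at all (the functions are plain Python, which is also why they test so easily):
+
```python
def base_value() -> int:
    """A source node with no dependencies."""
    return 2

def doubled(base_value: int) -> int:
    """Depends on `base_value` purely because of the parameter name."""
    return base_value * 2
```
+
+Hamilton would build nodes `base_value` and `doubled` with an edge between them; outside Hamilton the functions remain directly callable, e.g. `doubled(base_value())`.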
+
+**Key Architecture Components**
+- **Functions**: Define transformations with parameters as dependencies
+- **Driver**: Builds and manages DAG execution (`.execute()` runs the DAG)
+- **FunctionGraph**: Internal DAG representation
+- **Function Modifiers**: Decorators that modify DAG behavior (see below)
+- **Adapters**: Result formatters and lifecycle hooks
+
+**Separation of Concerns**
+- **Definition layer**: Pure Python functions (testable, reusable)
+- **Execution layer**: Driver configuration (where/how to run)
+- **Observation layer**: Monitoring, lineage, caching
+
+## Common Tasks
+
+### 1. Creating New Hamilton Modules
+
+When creating a new Hamilton module, follow these patterns:
+
+**Basic Module Structure:**
+```python
+"""
+Module docstring explaining the DAG's purpose.
+"""
+import pandas as pd
+from hamilton.function_modifiers import config, parameterize, extract_columns
+
+def raw_data(data_path: str) -> pd.DataFrame:
+    """Load raw data from source.
+
+    :param data_path: Path to data file (passed as input)
+    :return: Raw DataFrame
+    """
+    return pd.read_csv(data_path)
+
+def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Remove null values and duplicates.
+
+    :param raw_data: Raw data from previous node
+    :return: Cleaned DataFrame
+    """
+    return raw_data.dropna().drop_duplicates()
+
+def feature_a(cleaned_data: pd.DataFrame) -> pd.Series:
+    """Calculate feature A.
+
+    :param cleaned_data: Cleaned data
+    :return: Feature A values
+    """
+    return cleaned_data['column_a'] * 2
+```
+
+**Driver Setup:**
+```python
+from hamilton import driver
+import my_functions
+
+dr = driver.Driver({}, my_functions)
+results = dr.execute(
+    ['feature_a', 'cleaned_data'],
+    inputs={'data_path': 'data.csv'}
+)
+```
+
+**Best Practices for New Modules:**
+1. ✅ Add type hints to ALL function signatures
+2. ✅ Write clear docstrings with :param and :return
+3. ✅ Keep functions pure (no side effects unless using @does)
+4. ✅ Name functions after the output they produce
+5. ✅ Use function parameters for dependencies (not globals)
+6. ✅ Create unit tests for each function
+7. ❌ Don't use classes unless needed (functions are preferred)
+8. ❌ Don't mutate inputs (return new objects)
+
+### 2. Applying Function Modifiers (Decorators)
+
+Hamilton's power comes from function modifiers. Here's when to use each:
+
+**Configuration & Polymorphism**
+
+```python
+from hamilton.function_modifiers import config
+
+# Alternative implementations need distinct Python names (otherwise the second
+# definition overwrites the first); Hamilton strips the `__suffix` from the node name.
[email protected](model_type='linear')
+def predictions__linear(features: pd.DataFrame, target: pd.Series) -> pd.Series:
+    """Linear model predictions."""
+    from sklearn.linear_model import LinearRegression
+    model = LinearRegression().fit(features, target)
+    return pd.Series(model.predict(features), index=features.index)
+
[email protected](model_type='tree')
+def predictions__tree(features: pd.DataFrame, target: pd.Series) -> pd.Series:
+    """Tree model predictions."""
+    from sklearn.tree import DecisionTreeRegressor
+    model = DecisionTreeRegressor().fit(features, target)
+    return pd.Series(model.predict(features), index=features.index)
+
+# Use: driver.Driver({'model_type': 'linear'}, module)
+```
+
+**Parameterization - Creating Multiple Nodes**
+
+```python
+from hamilton.function_modifiers import parameterize
+
+@parameterize(
+    rolling_7d={'window': 7},
+    rolling_30d={'window': 30},
+    rolling_90d={'window': 90},
+)
+def rolling_average(spend: pd.Series, window: int) -> pd.Series:
+    """Calculate rolling average for different windows."""
+    return spend.rolling(window).mean()
+
+# Creates 3 nodes: rolling_7d, rolling_30d, rolling_90d
+```
+
+**Column Extraction - DataFrames to Series**
+
+```python
+from hamilton.function_modifiers import extract_columns
+
+@extract_columns('feature_1', 'feature_2', 'feature_3')
+def features(cleaned_data: pd.DataFrame) -> pd.DataFrame:
+    """Generate multiple features."""
+    return pd.DataFrame({
+        'feature_1': cleaned_data['a'] * 2,
+        'feature_2': cleaned_data['b'] ** 2,
+        'feature_3': cleaned_data['a'] + cleaned_data['b'],
+    })
+
+# Creates 3 nodes: feature_1, feature_2, feature_3 (each a Series)
+```
+
+**Data Quality Validation**
+
+```python
+from hamilton.function_modifiers import check_output
+import pandera as pa
+
+@check_output(
+    data_type=float,
+    range=(0, 100),
+    importance="fail"  # or "warn"
+)
+def revenue_percentage(revenue: float, total: float) -> float:
+    """Calculate revenue as percentage."""
+    return (revenue / total) * 100
+
+# With Pandera schemas
+@check_output(
+    schema=pa.SeriesSchema(float, pa.Check.greater_than(0)),
+    importance="fail"
+)
+def positive_values(data: pd.Series) -> pd.Series:
+    """Ensure all values are positive."""
+    return data
+```
+
+**I/O Materialization**
+
+```python
+from hamilton.function_modifiers import save_to, load_from, value
+
+@save_to.csv(path=value("output.csv"), output_name_="saved_results")
+def final_results(aggregated_data: pd.DataFrame) -> pd.DataFrame:
+    """Save final results to CSV (also creates a `saved_results` node)."""
+    return aggregated_data
+
+@load_from.parquet(path=value("data.parquet"))
+def input_data(loaded_data: pd.DataFrame) -> pd.DataFrame:
+    """Receives the loaded parquet data via the function parameter."""
+    return loaded_data
+```
+
+**Transformation Macros**
+
+```python
+from hamilton.function_modifiers import pipe
+
+@pipe(
+    step1=lambda df: df.dropna(),
+    step2=lambda df: df[df['value'] > 0],
+    step3=lambda df: df.reset_index(drop=True)
+)
+def cleaned_pipeline(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Apply sequential transformations."""
+    return raw_data
+```
+
+### 3. Converting Existing Code to Hamilton
+
+**Before (Script):**
+```python
+import pandas as pd
+
+df = pd.read_csv('data.csv')
+df = df.dropna()
+df['feature'] = df['col_a'] * 2
+result = df.groupby('category')['feature'].mean()
+print(result)
+```
+
+**After (Hamilton Module):**
+```python
+"""Data processing DAG."""
+import pandas as pd
+
+def raw_data(data_path: str) -> pd.DataFrame:
+    """Load raw data."""
+    return pd.read_csv(data_path)
+
+def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Remove nulls."""
+    return raw_data.dropna()
+
+def feature(cleaned_data: pd.DataFrame) -> pd.Series:
+    """Calculate feature."""
+    return cleaned_data['col_a'] * 2
+
+def data_with_feature(cleaned_data: pd.DataFrame, feature: pd.Series) -> pd.DataFrame:
+    """Add feature to dataset."""
+    df = cleaned_data.copy()
+    df['feature'] = feature
+    return df
+
+def result(data_with_feature: pd.DataFrame) -> pd.Series:
+    """Aggregate by category."""
+    return data_with_feature.groupby('category')['feature'].mean()
+```
+
+**Conversion Guidelines:**
+1. Identify distinct computation steps
+2. Extract each step into a pure function
+3. Use previous step's variable name as function parameter
+4. Add type hints and docstrings
+5. Remove imperative variable assignments
+6. Test each function independently
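Once converted, the wiring can be sanity-checked by chaining the functions by hand in dependency order — the same thing the Hamilton driver automates. A self-contained sketch (reproducing two of the functions above, with toy data):

```python
import pandas as pd

# Reproduced from the converted module above so this check is self-contained.
def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
    """Remove nulls."""
    return raw_data.dropna()

def feature(cleaned_data: pd.DataFrame) -> pd.Series:
    """Calculate feature."""
    return cleaned_data['col_a'] * 2

raw = pd.DataFrame({'col_a': [1.0, 2.0, None], 'category': ['x', 'y', 'x']})

# Chain in dependency order, exactly as the driver would.
cleaned = cleaned_data(raw)
df = cleaned.copy()
df['feature'] = feature(cleaned)
result = df.groupby('category')['feature'].mean()
print(result.to_dict())  # {'x': 2.0, 'y': 4.0}
```

If the hand-chained result matches the original script's output, the conversion preserved behavior.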
+
+### 4. Visualizing & Understanding DAGs
+
+**Generate Visualization:**
+```python
+from hamilton import driver
+import my_functions
+
+dr = driver.Driver({}, my_functions)
+
+# Create visualization
+dr.display_all_functions('dag.png')  # All nodes
+dr.visualize_execution(
+    ['final_output'],
+    'execution.png',
+    inputs={'input_data': ...}
+)  # Execution path only
+```
+
+**Understanding DAG Structure:**
+- Each function becomes a node
+- Function parameters create directed edges
+- No cycles allowed (DAG = Directed Acyclic Graph)
+- Execution order determined by dependencies
+- Multiple paths execute in parallel when possible
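The bullets above can be sketched in plain Python: derive edges from parameter names, then topologically sort. This is a toy illustration of the idea with hypothetical function names, not Hamilton's actual implementation:

```python
import inspect
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical mini-module: three functions chained by parameter names.
def raw(path: str) -> str:
    return path

def cleaned(raw: str) -> str:
    return raw.strip()

def result(cleaned: str) -> str:
    return cleaned.upper()

funcs = {f.__name__: f for f in (raw, cleaned, result)}

# A parameter naming another function becomes a directed edge into that node.
graph = {
    name: {p for p in inspect.signature(fn).parameters if p in funcs}
    for name, fn in funcs.items()
}

order = list(TopologicalSorter(graph).static_order())
print(order)  # dependencies always come before their dependents
```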
+
+**Debugging Tips:**
+1. Check for circular dependencies (A depends on B depends on A)
+2. Verify all parameter names match existing function names
+3. Look for typos in parameter names
+4. Use `dr.list_available_variables()` to see all nodes
+5. Check `dr.what_is_downstream_of('node_name')` for dependencies
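Tips 2 and 3 can be automated with a few lines of `inspect`, before even building a driver. A sketch using hypothetical function names (note the deliberate typo):

```python
import inspect

# A tiny module stand-in with a deliberate typo in a parameter name.
def cleaned_data(raw_data: int) -> int:
    return raw_data

def feature(cleand_data: int) -> int:  # typo: should be 'cleaned_data'
    return cleand_data * 2

funcs = {'cleaned_data': cleaned_data, 'feature': feature}
known = set(funcs) | {'raw_data'}  # node names plus expected external inputs

# Any parameter matching neither a function nor a known input is suspect.
unmatched = {
    name: [p for p in inspect.signature(fn).parameters if p not in known]
    for name, fn in funcs.items()
}
unmatched = {name: params for name, params in unmatched.items() if params}
print(unmatched)  # {'feature': ['cleand_data']}
```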
+
+### 5. Testing Hamilton Functions
+
+**Unit Testing Pattern:**
+```python
+import pytest
+import pandas as pd
+from my_functions import cleaned_data, feature
+
+def test_cleaned_data():
+    """Test data cleaning."""
+    raw = pd.DataFrame({
+        'col_a': [1, 2, None, 4],
+        'col_b': ['a', 'b', 'c', 'd']
+    })
+    result = cleaned_data(raw)
+    assert len(result) == 3
+    assert result['col_a'].isna().sum() == 0
+
+def test_feature():
+    """Test feature calculation."""
+    data = pd.DataFrame({'col_a': [1, 2, 3]})
+    result = feature(data)
+    pd.testing.assert_series_equal(
+        result,
+        pd.Series([2, 4, 6], name='col_a')
+    )
+```
+
+**Integration Testing with Driver:**
+```python
+def test_full_pipeline():
+    """Test complete DAG execution."""
+    from hamilton import driver
+    import my_functions
+
+    dr = driver.Driver({}, my_functions)
+    result = dr.execute(
+        ['result'],
+        inputs={'data_path': 'test_data.csv'}
+    )
+    assert 'result' in result
+    assert result['result'].sum() > 0
+```
+
+### 6. Common Patterns & Examples
+
+**Feature Engineering:**
+```python
+from hamilton.function_modifiers import extract_columns
+
+@extract_columns('mean_price', 'max_price', 'min_price', 'std_price')
+def price_features(prices: pd.Series) -> pd.DataFrame:
+    """Calculate price statistics."""
+    return pd.DataFrame({
+        'mean_price': [prices.mean()],
+        'max_price': [prices.max()],
+        'min_price': [prices.min()],
+        'std_price': [prices.std()],
+    })
+```
+
+**LLM/RAG Workflows:**
+```python
+from typing import List
+
+def chunked_documents(raw_text: str, chunk_size: int = 500) -> List[str]:
+    """Split documents into chunks."""
+    words = raw_text.split()
+    return [
+        ' '.join(words[i:i+chunk_size])
+        for i in range(0, len(words), chunk_size)
+    ]
+
+def embeddings(chunked_documents: List[str], model: str = 'text-embedding-ada-002') -> List[List[float]]:
+    """Generate embeddings for chunks."""
+    import openai
+    response = openai.OpenAI().embeddings.create(input=chunked_documents, model=model)
+    return [item.embedding for item in response.data]
+```
+
+**Data Quality Pipeline:**
+```python
+from hamilton.function_modifiers import check_output
+import pandera as pa
+
+schema = pa.DataFrameSchema({
+    'user_id': pa.Column(int, pa.Check.greater_than(0)),
+    'amount': pa.Column(float, pa.Check.in_range(0, 10000)),
+    'date': pa.Column(pa.DateTime),
+})
+
+@check_output(schema=schema, importance="fail")
+def validated_transactions(raw_transactions: pd.DataFrame) -> pd.DataFrame:
+    """Validate transaction data."""
+    return raw_transactions
+```
+
+## Key Files & Locations
+
+- **Core library**: `hamilton/` - Main package code
+- **Driver**: `hamilton/driver.py` - Main orchestration class
+- **Function modifiers**: `hamilton/function_modifiers/` - Decorators
+- **Plugins**: `hamilton/plugins/` - 50+ integrations
+- **Examples**: `examples/` - 60+ production examples
+- **Tests**: `tests/` - Unit and integration tests
+- **Docs**: `docs/` - Official documentation
+
+## Common Pitfalls & Solutions
+
+**Circular Dependencies:**
+```python
+# ❌ Bad - circular dependency
+def a(b: int) -> int:
+    return b + 1
+
+def b(a: int) -> int:
+    return a + 1
+
+# ✅ Good - break the cycle
+def a(input_value: int) -> int:
+    return input_value + 1
+
+def b(a: int) -> int:
+    return a + 1
+```
+
+**Missing Type Hints:**
+```python
+# ❌ Bad - no type hints
+def process(data):
+    return data * 2
+
+# ✅ Good - clear types
+def process(data: pd.Series) -> pd.Series:
+    return data * 2
+```
+
+**Mutating Inputs:**
+```python
+# ❌ Bad - mutates input
+def add_column(df: pd.DataFrame, col_name: str) -> pd.DataFrame:
+    df[col_name] = 0  # Modifies original!
+    return df
+
+# ✅ Good - returns new object
+def add_column(df: pd.DataFrame, col_name: str) -> pd.DataFrame:
+    result = df.copy()
+    result[col_name] = 0
+    return result
+```
+
+## Getting Help
+
+- **Documentation**: `docs/` directory in repo
+- **Examples**: `examples/` directory for patterns
+- **Community**: Hamilton Slack, GitHub issues
+- **Blog**: blog.dagworks.io for deep dives
+
+## Additional Resources
+
+For detailed reference material, see:
+- [examples.md](examples.md) - Curated example patterns
+- Hamilton official docs at docs.hamilton.dagworks.io
+- Hamilton GitHub repository examples folder
diff --git a/.claude/plugins/hamilton/skills/hamilton/examples.md b/.claude/plugins/hamilton/skills/hamilton/examples.md
new file mode 100644
index 00000000..828729d1
--- /dev/null
+++ b/.claude/plugins/hamilton/skills/hamilton/examples.md
@@ -0,0 +1,610 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hamilton Code Examples & Patterns
+
+This document provides comprehensive examples for common Hamilton patterns.
+
+## Table of Contents
+
+1. [Basic DAG Creation](#basic-dag-creation)
+2. [Advanced Function Modifiers](#advanced-function-modifiers)
+3. [LLM & RAG Workflows](#llm--rag-workflows)
+4. [Feature Engineering](#feature-engineering)
+5. [Data Quality & Validation](#data-quality--validation)
+6. [Parallel Execution](#parallel-execution)
+7. [Caching Strategies](#caching-strategies)
+8. [Integration Patterns](#integration-patterns)
+9. [Advanced Patterns](#advanced-patterns)
+
+---
+
+## Basic DAG Creation
+
+### Hello World Example
+
+**File: `hello_world_functions.py`**
+```python
+"""Simple Hamilton DAG for data processing."""
+import pandas as pd
+
+def raw_data(csv_path: str) -> pd.DataFrame:
+    """Load data from CSV file.
+
+    :param csv_path: Path to CSV file (external input)
+    """
+    return pd.read_csv(csv_path)
+
+def avg_age(raw_data: pd.DataFrame) -> float:
+    """Calculate average age.
+
+    :param raw_data: Raw data DataFrame
+    """
+    return raw_data['age'].mean()
+
+def filtered_data(raw_data: pd.DataFrame, min_age: int = 18) -> pd.DataFrame:
+    """Filter data by minimum age.
+
+    :param raw_data: Raw data DataFrame
+    :param min_age: Minimum age threshold (config/input)
+    """
+    return raw_data[raw_data['age'] >= min_age]
+```
+
+**File: `run.py`**
+```python
+from hamilton import driver
+import hello_world_functions
+
+# Create driver
+dr = driver.Driver(
+    {'min_age': 21},  # Config values (positional; modules follow as *args)
+    hello_world_functions,
+)
+
+# Execute DAG
+results = dr.execute(
+    final_vars=['avg_age', 'filtered_data'],
+    inputs={'csv_path': 'data.csv'}  # External inputs
+)
+
+print(f"Average age: {results['avg_age']}")
+print(f"Filtered rows: {len(results['filtered_data'])}")
+```
+
+---
+
+## Advanced Function Modifiers
+
+### Parameterization Patterns
+
+**Multiple Time Windows:**
+```python
+from hamilton.function_modifiers import parameterize, value
+
[email protected](
+    daily_sum={'window': value('1D'), 'agg': value('sum')},
+    weekly_avg={'window': value('7D'), 'agg': value('mean')},
+    monthly_max={'window': value('30D'), 'agg': value('max')},
+)
+def time_aggregation(
+    sales: pd.Series,
+    window: str,
+    agg: str
+) -> pd.Series:
+    """Aggregate sales over different time windows.
+
+    Creates nodes: daily_sum, weekly_avg, monthly_max
+    """
+    return sales.rolling(window).agg(agg)
+```
+
+**Multiple Data Sources:**
+```python
+from hamilton.function_modifiers import parameterize_sources
+
+@parameterize_sources(
+    revenue_per_user=dict(metric='revenue', denominator='users'),
+    cost_per_signup=dict(metric='cost', denominator='signups'),
+    profit_per_order=dict(metric='profit', denominator='orders'),
+)
+def rate_calculation(metric: pd.Series, denominator: pd.Series) -> pd.Series:
+    """Calculate various rate metrics."""
+    return metric / denominator
+```
+
+### Configuration-Based Polymorphism
+
+**Model Selection:**
+```python
+from hamilton.function_modifiers import config
+from sklearn.linear_model import LinearRegression, Ridge, Lasso
+
[email protected](model='linear')
+def trained_model__linear(X_train: pd.DataFrame, y_train: pd.Series) -> LinearRegression:
+    """Train linear regression model."""
+    model = LinearRegression()
+    model.fit(X_train, y_train)
+    return model
+
[email protected](model='ridge')
+def trained_model__ridge(X_train: pd.DataFrame, y_train: pd.Series) -> Ridge:
+    """Train ridge regression model."""
+    model = Ridge(alpha=1.0)
+    model.fit(X_train, y_train)
+    return model
+
[email protected](model='lasso')
+def trained_model__lasso(X_train: pd.DataFrame, y_train: pd.Series) -> Lasso:
+    """Train lasso regression model."""
+    model = Lasso(alpha=0.1)
+    model.fit(X_train, y_train)
+    return model
+
+# All create same node: trained_model
+# Selected by config: driver.Driver({'model': 'ridge'}, module)
+```
+
+**Environment-Specific Behavior:**
+```python
[email protected](environment='development')
+def data_source__dev() -> str:
+    """Development database connection."""
+    return "sqlite:///dev.db"
+
[email protected](environment='production')
+def data_source__prod() -> str:
+    """Production database connection."""
+    return "postgresql://prod-server/db"
+```
+
+### Column Extraction
+
+**Feature Generation:**
+```python
+from hamilton.function_modifiers import extract_columns
+
+@extract_columns('price_mean', 'price_median', 'price_std', 'price_min', 'price_max')
+def price_statistics(transactions: pd.DataFrame) -> pd.DataFrame:
+    """Generate price statistics.
+
+    Creates 5 separate nodes, each containing a single series.
+    """
+    prices = transactions['price']
+    return pd.DataFrame({
+        'price_mean': [prices.mean()],
+        'price_median': [prices.median()],
+        'price_std': [prices.std()],
+        'price_min': [prices.min()],
+        'price_max': [prices.max()],
+    })
+```
+
+---
+
+## LLM & RAG Workflows
+
+### Document Processing Pipeline
+
+```python
+from typing import List, Dict
+import openai
+from hamilton.function_modifiers import parameterize, extract_columns
+
+def document_text(pdf_path: str) -> str:
+    """Extract text from PDF."""
+    import PyPDF2
+    with open(pdf_path, 'rb') as f:
+        reader = PyPDF2.PdfReader(f)
+        return '\n'.join(page.extract_text() for page in reader.pages)
+
+def chunks(document_text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
+    """Split document into overlapping chunks."""
+    chunks = []
+    start = 0
+    while start < len(document_text):
+        end = start + chunk_size
+        chunks.append(document_text[start:end])
+        start = end - overlap
+    return chunks
+
+def embeddings(chunks: List[str], embedding_model: str = 'text-embedding-ada-002') -> List[List[float]]:
+    """Generate embeddings for chunks."""
+    response = openai.OpenAI().embeddings.create(
+        input=chunks,
+        model=embedding_model
+    )
+    return [item.embedding for item in response.data]
+
+def vector_store(chunks: List[str], embeddings: List[List[float]]) -> Dict:
+    """Store chunks with embeddings in vector database."""
+    import pinecone
+    index = pinecone.Index('documents')
+
+    vectors = [
+        (f"chunk_{i}", emb, {"text": chunk})
+        for i, (chunk, emb) in enumerate(zip(chunks, embeddings))
+    ]
+    index.upsert(vectors)
+    return {"status": "success", "count": len(vectors)}
+
+def query_results(
+    query: str,
+    embedding_model: str,
+    vector_store: Dict,
+    top_k: int = 5
+) -> List[str]:
+    """Retrieve relevant chunks for query."""
+    import pinecone
+
+    # Generate query embedding
+    query_emb = openai.OpenAI().embeddings.create(
+        input=[query],
+        model=embedding_model
+    ).data[0].embedding
+
+    # Search vector store
+    index = pinecone.Index('documents')
+    results = index.query(query_emb, top_k=top_k, include_metadata=True)
+
+    return [match['metadata']['text'] for match in results['matches']]
+
+def llm_response(query: str, query_results: List[str]) -> str:
+    """Generate response using retrieved context."""
+    context = "\n\n".join(query_results)
+    prompt = f"""Answer the question based on the context below.
+
+Context:
+{context}
+
+Question: {query}
+
+Answer:"""
+
+    response = openai.OpenAI().chat.completions.create(
+        model="gpt-4",
+        messages=[{"role": "user", "content": prompt}]
+    )
+    return response.choices[0].message.content
+```
+
+### Multi-Provider Vector Database Pattern
+
+```python
+from hamilton.function_modifiers import config
+
[email protected](vector_db='pinecone')
+def vector_store__pinecone(embeddings: List[List[float]], chunks: List[str]) -> object:
+    """Store in Pinecone."""
+    import pinecone
+    index = pinecone.Index('docs')
+    index.upsert([(f"c{i}", e, {"text": c}) for i, (e, c) in enumerate(zip(embeddings, chunks))])
+    return index
+
[email protected](vector_db='weaviate')
+def vector_store__weaviate(embeddings: List[List[float]], chunks: List[str]) -> object:
+    """Store in Weaviate."""
+    import weaviate
+    client = weaviate.Client("http://localhost:8080")
+    # Weaviate-specific logic
+    return client
+
[email protected](vector_db='lancedb')
+def vector_store__lancedb(embeddings: List[List[float]], chunks: List[str]) -> object:
+    """Store in LanceDB."""
+    import lancedb
+    db = lancedb.connect("./lancedb")
+    # LanceDB-specific logic
+    return db
+```
+
+---
+
+## Feature Engineering
+
+### Time Series Features
+
+```python
+import pandas as pd
+from typing import Callable, Optional
+from hamilton.function_modifiers import parameterize, value
+
+def timestamps(data: pd.DataFrame) -> pd.Series:
+    """Convert to datetime."""
+    return pd.to_datetime(data['timestamp'])
+
[email protected](
+    hour_of_day={'freq': value('hour')},
+    day_of_week={'freq': value('dayofweek')},
+    month_of_year={'freq': value('month')},
+    is_weekend={'freq': value('dayofweek'), 'transform': value(lambda x: x >= 5)},
+)
+def time_features(timestamps: pd.Series, freq: str, transform: Optional[Callable] = None) -> pd.Series:
+    """Extract time-based features."""
+    result = getattr(timestamps.dt, freq)
+    return transform(result) if transform else result
+
+def rolling_mean_7d(values: pd.Series) -> pd.Series:
+    """7-day rolling average."""
+    return values.rolling(window=7, min_periods=1).mean()
+
+def rolling_std_7d(values: pd.Series) -> pd.Series:
+    """7-day rolling standard deviation."""
+    return values.rolling(window=7, min_periods=1).std()
+
+def lag_1d(values: pd.Series) -> pd.Series:
+    """Previous day's value."""
+    return values.shift(1)
+
+def diff_1d(values: pd.Series) -> pd.Series:
+    """Day-over-day difference."""
+    return values.diff(1)
+```
+
+### Categorical Feature Engineering
+
+```python
+from sklearn.preprocessing import LabelEncoder, OneHotEncoder
+import pandas as pd
+
+def label_encoded_category(category_column: pd.Series) -> pd.Series:
+    """Label encode categorical variable."""
+    encoder = LabelEncoder()
+    return pd.Series(encoder.fit_transform(category_column))
+
+def one_hot_encoded_categories(category_column: pd.Series) -> pd.DataFrame:
+    """One-hot encode categorical variable."""
+    return pd.get_dummies(category_column, prefix='cat')
+
+def frequency_encoded_category(category_column: pd.Series) -> pd.Series:
+    """Frequency encoding for high-cardinality categories."""
+    freq_map = category_column.value_counts(normalize=True).to_dict()
+    return category_column.map(freq_map)
+```
+
+---
+
+## Data Quality & Validation
+
+### Schema Validation with Pandera
+
+```python
+from hamilton.function_modifiers import check_output
+import pandera as pa
+
+# Define schema
+transaction_schema = pa.DataFrameSchema({
+    'transaction_id': pa.Column(str, pa.Check.str_length(min_value=10, max_value=20)),
+    'user_id': pa.Column(int, pa.Check.greater_than(0)),
+    'amount': pa.Column(float, pa.Check.in_range(0, 100000)),
+    'timestamp': pa.Column(pa.DateTime),
+    'status': pa.Column(str, pa.Check.isin(['pending', 'completed', 'failed'])),
+})
+
+@check_output(schema=transaction_schema, importance="fail")
+def validated_transactions(raw_transactions: pd.DataFrame) -> pd.DataFrame:
+    """Validate transaction data schema."""
+    return raw_transactions
+
+# Custom validators
+@check_output(
+    data_type=float,
+    range=(0, 1),
+    allow_nans=False,
+    importance="fail"
+)
+def probability_score(model_output: pd.Series) -> pd.Series:
+    """Ensure scores are valid probabilities."""
+    return model_output
+```
+
+### Custom Data Quality Checks
+
+```python
+from hamilton.function_modifiers import check_output_custom
+from hamilton.data_quality import base
+
+# check_output_custom takes DataValidator instances; this is a minimal sketch —
+# see hamilton.data_quality.base for the full interface.
+class NoDuplicatesValidator(base.DataValidator):
+    """Fails validation if the DataFrame contains duplicate rows."""
+
+    @classmethod
+    def applies_to(cls, datatype: type) -> bool:
+        return issubclass(datatype, pd.DataFrame)
+
+    def description(self) -> str:
+        return "Checks that there are no duplicate rows."
+
+    @classmethod
+    def name(cls) -> str:
+        return "no_duplicates_validator"
+
+    def validate(self, data: pd.DataFrame) -> base.ValidationResult:
+        passes = not data.duplicated().any()
+        return base.ValidationResult(passes=passes, message="duplicate row check", diagnostics={})
+
[email protected]_output_custom(NoDuplicatesValidator(importance="warn"))
+def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Clean data with quality checks."""
+    return raw_data.drop_duplicates()
+
+---
+
+## Parallel Execution
+
+### ThreadPoolExecutor Pattern
+
+```python
+from hamilton import driver
+from hamilton.execution import executors
+import my_functions
+
+# Create driver with a parallel task executor (requires dynamic execution mode;
+# parallelism applies to Parallelizable/Collect blocks in the DAG)
+dr = (
+    driver.Builder()
+    .with_modules(my_functions)
+    .enable_dynamic_execution(allow_experimental_mode=True)
+    .with_remote_executor(executors.MultiThreadingExecutor(max_tasks=4))
+    .build()
+)
+
+results = dr.execute(['output1', 'output2'])
+```
+
+### Ray Distributed Execution
+
+```python
+from hamilton import base, driver
+from hamilton.plugins import h_ray
+import ray
+
+ray.init()
+
+# Use the Ray graph adapter for distributed processing
+ray_adapter = h_ray.RayGraphAdapter(result_builder=base.DictResult())
+
+dr = driver.Driver(
+    {},
+    my_functions,
+    adapter=ray_adapter
+)
+
+results = dr.execute(['large_computation'], inputs={...})
+```
+
+---
+
+## Caching Strategies
+
+### Function-Level Caching
+
+```python
+from hamilton import driver
+from hamilton.function_modifiers import cache
+
[email protected](format="parquet")  # optionally control how this node's result is serialized
+def expensive_computation(large_dataset: pd.DataFrame) -> pd.DataFrame:
+    """Cached to disk once the driver enables caching."""
+    # Expensive operation
+    return large_dataset.apply(complex_transformation)
+
+# Turn caching on at the driver level; results are stored under ./cache
+dr = driver.Builder().with_modules(my_functions).with_cache(path="./cache").build()
+```
+
+### Materializing Intermediate Results to Parquet
+
+```python
+from hamilton.function_modifiers import save_to, load_from, value
+
[email protected]_to.parquet(path=value("intermediate_results.parquet"), output_name_="saved_intermediate")
+def intermediate_step(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Save intermediate results."""
+    return raw_data.groupby('category').sum()
+
+# Later in the pipeline (or a separate run)
[email protected]_from.parquet(path=value("intermediate_results.parquet"))
+def cached_intermediate(data: pd.DataFrame) -> pd.DataFrame:
+    """Receive the cached intermediate results via the injected parameter."""
+    return data
+```
+
+---
+
+## Integration Patterns
+
+### Airflow Integration
+
+```python
+from datetime import datetime
+
+from airflow import DAG
+from airflow.operators.python import PythonOperator
+from hamilton import driver
+import my_hamilton_functions
+
+def run_hamilton_dag(**context):
+    """Execute Hamilton DAG within Airflow task."""
+    dr = driver.Driver({}, my_hamilton_functions)
+    results = dr.execute(['final_output'], inputs=context['params'])
+    return results
+
+with DAG('hamilton_pipeline', schedule='@daily', start_date=datetime(2024, 1, 1)) as dag:
+    task = PythonOperator(
+        task_id='run_hamilton',
+        python_callable=run_hamilton_dag,
+        params={'data_date': '{{ ds }}'}
+    )
+```
+
+### FastAPI Integration
+
+```python
+from fastapi import FastAPI
+from hamilton import driver
+import prediction_functions
+
+app = FastAPI()
+dr = driver.Driver({}, prediction_functions)
+
[email protected]("/predict")
+def predict(features: dict):
+    """Stateless prediction endpoint."""
+    result = dr.execute(['prediction'], inputs=features)
+    return {"prediction": result['prediction']}
+```
+
+### Streamlit Dashboard
+
+```python
+import streamlit as st
+from hamilton import driver
+import dashboard_functions
+
+st.title("Data Dashboard")
+
+# User inputs
+date_range = st.date_input("Select date range", [])
+metric = st.selectbox("Metric", ["revenue", "users", "conversions"])
+
+# Execute Hamilton DAG
+dr = driver.Driver({'metric': metric}, dashboard_functions)
+results = dr.execute(['visualization_data'], inputs={'date_range': date_range})
+
+# Display results
+st.line_chart(results['visualization_data'])
+```
+
+---
+
+## Advanced Patterns
+
+### Subdag Pattern
+
+```python
+import pandas as pd
+from hamilton.function_modifiers import subdag, source
+import feature_engineering  # module defining the sub-DAG's functions
+
[email protected](
+    feature_engineering,
+    inputs={'input_data': source('raw_data')},  # wire external nodes into the subdag
+    config={'scaler': 'standard'},
+)
+def processed_features(scaled_features: pd.DataFrame) -> pd.DataFrame:
+    """Expose the subdag's 'scaled_features' node."""
+    return scaled_features
+```
+
+### Dynamic Execution
+
+```python
+from hamilton import driver
+import dynamic_functions  # defines nodes using Parallelizable[...] / Collect[...]
+
+# Dynamic fan-out/fan-in requires enabling dynamic execution on the driver
+dr = (
+    driver.Builder()
+    .with_modules(dynamic_functions)
+    .enable_dynamic_execution(allow_experimental_mode=True)
+    .build()
+)
+
+# A Parallelizable node fans out over param_list at runtime;
+# a Collect node gathers the per-element results
+results = dr.execute(
+    ['final_output'],
+    inputs={'param_list': ['a', 'b', 'c']},
+)
+```
+
+This covers the most common Hamilton patterns. Refer to the main SKILL.md for additional guidance and the `examples/` directory in the Hamilton repository for more production examples.
diff --git a/.claude/settings.local.json b/.claude/settings.local.json
new file mode 100644
index 00000000..3dcaf067
--- /dev/null
+++ b/.claude/settings.local.json
@@ -0,0 +1,31 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(python:*)",
+      "Bash(find:*)",
+      "Bash(cat:*)",
+      "Bash(git restore:*)",
+      "Bash(tee:*)",
+      "Bash(echo:*)",
+      "Bash(python3:*)",
+      "Bash(pytest:*)",
+      "Bash(uv sync:*)",
+      "Bash(source:*)",
+      "Bash(pip install:*)",
+      "Bash(pip show:*)",
+      "Bash(/Users/skrawczyk/salesforce/hamilton/.venv/bin/pip install:*)",
+      "Bash(uv pip install:*)",
+      "Bash(VIRTUAL_ENV=/Users/skrawczyk/salesforce/hamilton/.venv uv pip install:*)",
+      "Bash(VIRTUAL_ENV=/Users/skrawczyk/salesforce/hamilton/.venv uv run pytest:*)",
+      "Bash(VIRTUAL_ENV=/Users/skrawczyk/salesforce/hamilton/.venv .venv/bin/python -m pytest:*)",
+      "Bash(VIRTUAL_ENV=/Users/skrawczyk/salesforce/hamilton/.venv ../../.venv/bin/python:*)",
+      "Bash(VIRTUAL_ENV=/Users/skrawczyk/salesforce/hamilton/.venv .venv/bin/python:*)",
+      "Bash(VIRTUAL_ENV=/Users/skrawczyk/salesforce/hamilton/.venv uv pip:*)",
+      "Bash(gh run view:*)",
+      "Bash(grep:*)",
+      "Bash(gh run list:*)",
+      "Bash(curl:*)",
+      "Bash(ls:*)"
+    ]
+  }
+}
diff --git a/.claude/skills/hamilton/README.md b/.claude/skills/hamilton/README.md
new file mode 100644
index 00000000..a0b6b659
--- /dev/null
+++ b/.claude/skills/hamilton/README.md
@@ -0,0 +1,251 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hamilton Claude Code Skill
+
+A comprehensive AI assistant skill for [Hamilton](https://github.com/dagworks-inc/hamilton) development, designed to help you build, debug, and optimize Hamilton DAGs using Claude Code.
+
+## What is This?
+
+This is a [Claude Code skill](https://code.claude.com/docs/en/skills.md) that provides expert assistance for Hamilton development. When active, Claude Code understands Hamilton's patterns and best practices, and can help you:
+
+- 🏗️ **Create new Hamilton modules** with proper patterns and decorators
+- 🔍 **Understand existing DAGs** by explaining dataflow and dependencies
+- 🎨 **Apply function modifiers** correctly (@parameterize, @config.when, @check_output, etc.)
+- 🐛 **Debug issues** in DAG definitions and execution
+- 🔄 **Convert Python scripts** to Hamilton modules
+- ⚡ **Optimize pipelines** with caching, parallelization, and best practices
+- ✅ **Write tests** for Hamilton functions
+- 📊 **Generate visualizations** of your DAGs
+
+## Installation
+
+### For Hamilton Contributors (Automatic)
+
+If you've cloned the Hamilton repository, the skill is already available! It's located in `.claude/skills/hamilton/` and will be automatically discovered by Claude Code.
+
+### For Hamilton Users (Manual Installation)
+
+To use this skill in your own Hamilton projects:
+
+#### Option 1: Copy to Your Project (Project-Scoped)
+
+```bash
+# In your Hamilton project directory
+mkdir -p .claude/skills
+cp -r /path/to/hamilton/.claude/skills/hamilton .claude/skills/
+
+# Commit to version control so your team gets it too
+git add .claude/skills/hamilton
+git commit -m "Add Hamilton Claude Code skill"
+```
+
+#### Option 2: Install Globally (Available Everywhere)
+
+```bash
+# Install to your personal Claude Code skills directory
+mkdir -p ~/.claude/skills
+cp -r /path/to/hamilton/.claude/skills/hamilton ~/.claude/skills/
+
+# Now available in all your projects!
+```
+
+#### Option 3: Symlink (For Active Development)
+
+```bash
+# Create symlink to stay in sync with Hamilton repo updates
+ln -s /path/to/hamilton/.claude/skills/hamilton ~/.claude/skills/hamilton
+```
+
+## Usage
+
+### Automatic Invocation
+
+Claude Code will automatically use this skill when it detects you're working with Hamilton code. Just ask questions or give instructions naturally:
+
+```
+"Help me create a Hamilton module for processing customer data"
+"Explain what this DAG does"
+"Convert this pandas script to Hamilton"
+"Add caching to my expensive computation function"
+"Why am I getting a circular dependency error?"
+```
+
+### Manual Invocation
+
+You can explicitly invoke the skill using the `/hamilton` command:
+
+```
+/hamilton create a feature engineering module with rolling averages
+/hamilton explain the dataflow in my_functions.py
+/hamilton optimize this DAG for parallel execution
+```
+
+## What the Skill Knows
+
+This skill has deep knowledge of:
+
+- **Core Hamilton concepts**: Drivers, DAGs, nodes, function-based definitions
+- **Function modifiers**: All decorators (@parameterize, @config.when, @extract_columns, @check_output, @save_to, @load_from, @cache, @pipe, @does, @mutate, @step, etc.)
+- **Execution patterns**: Sequential, parallel, distributed (Ray, Dask, Spark)
+- **Data quality**: Validation, schema checking, data quality pipelines
+- **I/O patterns**: Materialization, data loaders, result adapters
+- **Integration patterns**: Airflow, Streamlit, FastAPI, Jupyter
+- **LLM workflows**: RAG pipelines, document processing, embeddings
+- **Testing strategies**: Unit testing functions, integration testing DAGs
+- **Debugging techniques**: Circular dependencies, visualization, lineage tracing
+
+## Examples
+
+### Creating a New Hamilton Module
+
+```
+"Create a Hamilton module that loads data from a CSV, cleans it by removing
+nulls, calculates a 7-day rolling average of the 'sales' column, and outputs
+the top 10 days by sales."
+```
+
+Claude will generate:
+- Properly structured functions with type hints
+- Correct dependency declarations via parameters
+- Appropriate docstrings
+- Driver setup code
+- Suggestions for visualization
+
+### Converting Existing Code
+
+```
+"Convert this script to Hamilton:
+
+import pandas as pd
+df = pd.read_csv('data.csv')
+df['feature'] = df['col_a'] * 2 + df['col_b']
+result = df.groupby('category')['feature'].mean()
+"
+```
+
+Claude will refactor it into a clean Hamilton module with separate functions for each transformation step.
+
+### Applying Decorators
+
+```
+"I need to create rolling averages for 7, 30, and 90 day windows.
+How do I do this in Hamilton without repeating code?"
+```
+
+Claude will show you how to use `@parameterize` to create multiple nodes from a single function.
+
+### Debugging
+
+```
+"I'm getting an error: 'Could not find parameter 'processed_data' in graph'.
+What's wrong?"
+```
+
+Claude will help identify the issue (likely a typo or missing function definition) and suggest fixes.
+
+## Skill Features
+
+### Allowed Tools
+
+This skill is configured with permissions to:
+- Read files (`Read`, `Grep`, `Glob`)
+- Run Python code (`Bash(python:*)`)
+- Search for files (`Bash(find:*)`)
+- Run tests (`Bash(pytest:*)`)
+
+These tools are automatically permitted when the skill is active, streamlining the workflow.
+
+### Reference Materials
+
+The skill includes additional reference files:
+
+- **[examples.md](examples.md)** - Comprehensive code examples for common patterns
+  - Basic DAG creation
+  - Advanced function modifiers
+  - LLM & RAG workflows
+  - Feature engineering patterns
+  - Data quality validation
+  - Parallel execution
+  - Integration patterns (Airflow, FastAPI, Streamlit)
+
+## Requirements
+
+- **Claude Code CLI** - Install from https://code.claude.com
+- **Hamilton** - The skill works with any version, but references Hamilton 1.x+ patterns
+- **Python 3.9+** - For running generated Hamilton code
+
+## Contributing
+
+This skill is open source and part of the Hamilton project! We welcome contributions:
+
+### Found a Bug?
+
+Please [file an issue](https://github.com/dagworks-inc/hamilton/issues/new) on GitHub with:
+- A clear description of the problem
+- Steps to reproduce
+- Expected vs actual behavior
+- Your Hamilton and Claude Code versions
+
+### Want to Improve It?
+
+Even better - submit a pull request!
+
+1. **Fork the repository**: https://github.com/dagworks-inc/hamilton
+2. **Make your changes**: Edit files in `.claude/skills/hamilton/`
+3. **Test thoroughly**: Try the skill with various Hamilton scenarios
+4. **Submit a PR**: Include a clear description of your improvements
+
+**Types of contributions we love:**
+- 📚 Add new examples to `examples.md`
+- 📝 Improve instructions in `SKILL.md`
+- 🐛 Fix bugs or inaccuracies
+- ✨ Add support for new Hamilton features
+- 📖 Enhance documentation
+
+See [CONTRIBUTING.md](../../../CONTRIBUTING.md) in the Hamilton repo for detailed guidelines.
+
+## Philosophy
+
+This skill follows Hamilton's core philosophy:
+
+- **Declarative over imperative**: Guide users toward function-based definitions
+- **Separation of concerns**: Keep definition, execution, and observation separate
+- **Reusability**: Encourage patterns that make code testable and portable
+- **Simplicity**: Prefer simple solutions over over-engineering
+
+## Learn More
+
+- **Hamilton Documentation**: https://hamilton.dagworks.io
+- **GitHub Repository**: https://github.com/dagworks-inc/hamilton
+- **Hamilton Examples**: See `examples/` directory in the repo (60+ production examples)
+- **DAGWorks Blog**: https://blog.dagworks.io
+- **Community Slack**: Join via Hamilton GitHub repo
+
+## Version
+
+This skill is maintained alongside Hamilton and evolves with the framework. Last updated: January 2025
+
+## License
+
+This skill is part of the Hamilton project and is licensed under the same terms (Apache 2.0 License).
+
+---
+
+**Happy Hamilton coding with Claude! 🚀**
diff --git a/.claude/skills/hamilton/SKILL.md b/.claude/skills/hamilton/SKILL.md
new file mode 100644
index 00000000..22841cc1
--- /dev/null
+++ b/.claude/skills/hamilton/SKILL.md
@@ -0,0 +1,474 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+---
+name: hamilton
+description: Expert assistant for Hamilton DAG development. Use when creating Hamilton modules, applying decorators, visualizing DAGs, debugging dataflows, converting Python code to Hamilton patterns, or optimizing Hamilton pipelines.
+allowed-tools: Read, Grep, Glob, Bash(python:*), Bash(find:*), Bash(pytest:*)
+user-invocable: true
+disable-model-invocation: false
+---
+
+# Hamilton Development Assistant
+
+Hamilton is a lightweight Python framework for building Directed Acyclic Graphs (DAGs) of data transformations using declarative, function-based definitions.
+
+## Core Principles
+
+**Function-Based DAG Definition**
+- Functions with type hints define nodes in the DAG
+- Function parameters automatically create edges (dependencies)
+- Function names become node names in the DAG
+- Pure functions enable easy testing and reusability
+
+**Key Architecture Components**
+- **Functions**: Define transformations with parameters as dependencies
+- **Driver**: Builds and manages DAG execution (`.execute()` runs the DAG)
+- **FunctionGraph**: Internal DAG representation
+- **Function Modifiers**: Decorators that modify DAG behavior (see below)
+- **Adapters**: Result formatters and lifecycle hooks
+
+**Separation of Concerns**
+- **Definition layer**: Pure Python functions (testable, reusable)
+- **Execution layer**: Driver configuration (where/how to run)
+- **Observation layer**: Monitoring, lineage, caching
+
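The first two principles (parameter names as edges, function names as nodes) can be illustrated with a framework-free sketch. This is an analogy, not Hamilton's actual implementation; `raw_data`, `cleaned`, and `doubled` are hypothetical stand-ins:

```python
import inspect

# Three hypothetical "nodes": each parameter name refers to another function
def raw_data() -> list:
    return [1, 2, None, 4]

def cleaned(raw_data: list) -> list:
    return [x for x in raw_data if x is not None]

def doubled(cleaned: list) -> list:
    return [x * 2 for x in cleaned]

FUNCS = {f.__name__: f for f in (raw_data, cleaned, doubled)}

def resolve(name: str, cache: dict) -> object:
    """Recursively compute a node, treating parameter names as dependencies."""
    if name not in cache:
        deps = inspect.signature(FUNCS[name]).parameters
        cache[name] = FUNCS[name](**{p: resolve(p, cache) for p in deps})
    return cache[name]

print(resolve('doubled', {}))  # [2, 4, 8]
```

Requesting `doubled` transitively computes `cleaned` and `raw_data` first, which is exactly the dependency resolution the Driver performs over a module's functions.
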
+## Common Tasks
+
+### 1. Creating New Hamilton Modules
+
+When creating a new Hamilton module, follow these patterns:
+
+**Basic Module Structure:**
+```python
+"""
+Module docstring explaining the DAG's purpose.
+"""
+import pandas as pd
+from hamilton.function_modifiers import config, parameterize, extract_columns
+
+def raw_data(data_path: str) -> pd.DataFrame:
+    """Load raw data from source.
+
+    :param data_path: Path to data file (passed as input)
+    :return: Raw DataFrame
+    """
+    return pd.read_csv(data_path)
+
+def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Remove null values and duplicates.
+
+    :param raw_data: Raw data from previous node
+    :return: Cleaned DataFrame
+    """
+    return raw_data.dropna().drop_duplicates()
+
+def feature_a(cleaned_data: pd.DataFrame) -> pd.Series:
+    """Calculate feature A.
+
+    :param cleaned_data: Cleaned data
+    :return: Feature A values
+    """
+    return cleaned_data['column_a'] * 2
+```
+
+**Driver Setup:**
+```python
+from hamilton import driver
+import my_functions
+
+dr = driver.Driver({}, my_functions)
+results = dr.execute(
+    ['feature_a', 'cleaned_data'],
+    inputs={'data_path': 'data.csv'}
+)
+```
+
+**Best Practices for New Modules:**
+1. ✅ Add type hints to ALL function signatures
+2. ✅ Write clear docstrings with :param and :return
+3. ✅ Keep functions pure (no side effects unless using @does)
+4. ✅ Name functions after the output they produce
+5. ✅ Use function parameters for dependencies (not globals)
+6. ✅ Create unit tests for each function
+7. ❌ Don't use classes unless needed (functions are preferred)
+8. ❌ Don't mutate inputs (return new objects)
+
+### 2. Applying Function Modifiers (Decorators)
+
+Hamilton's power comes from function modifiers. Here's when to use each:
+
+**Configuration & Polymorphism**
+
+```python
+from hamilton.function_modifiers import config
+
[email protected](model_type='linear')
+def predictions__linear(features: pd.DataFrame, target: pd.Series) -> pd.Series:
+    """Linear model predictions."""
+    from sklearn.linear_model import LinearRegression
+    model = LinearRegression().fit(features, target)
+    return pd.Series(model.predict(features), index=features.index)
+
[email protected](model_type='tree')
+def predictions__tree(features: pd.DataFrame, target: pd.Series) -> pd.Series:
+    """Tree model predictions."""
+    from sklearn.tree import DecisionTreeRegressor
+    model = DecisionTreeRegressor().fit(features, target)
+    return pd.Series(model.predict(features), index=features.index)
+
+# Use: driver.Driver({'model_type': 'linear'}, module)
+```
+
+**Parameterization - Creating Multiple Nodes**
+
+```python
+from hamilton.function_modifiers import parameterize, value
+
+@parameterize(
+    rolling_7d={'window': value(7)},
+    rolling_30d={'window': value(30)},
+    rolling_90d={'window': value(90)},
+)
+def rolling_average(spend: pd.Series, window: int) -> pd.Series:
+    """Calculate rolling average for different windows."""
+    return spend.rolling(window).mean()
+
+# Creates 3 nodes: rolling_7d, rolling_30d, rolling_90d
+```
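Conceptually, `@parameterize` fans one definition out into several named nodes, much like binding keyword arguments. A rough framework-free analogy (not Hamilton's actual implementation):

```python
from functools import partial

def rolling_average(spend: list, window: int) -> float:
    """Average of the trailing `window` values."""
    return sum(spend[-window:]) / window

# One definition, three "nodes" -- mirroring the decorator's keyword specs
specs = {'rolling_7d': 7, 'rolling_30d': 30, 'rolling_90d': 90}
nodes = {name: partial(rolling_average, window=w) for name, w in specs.items()}

print(nodes['rolling_7d']([1.0] * 30))  # 1.0
```

Each generated node is independently addressable in `dr.execute(...)`, just as if you had written three separate functions.
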
+
+**Column Extraction - DataFrames to Series**
+
+```python
+from hamilton.function_modifiers import extract_columns
+
+@extract_columns('feature_1', 'feature_2', 'feature_3')
+def features(cleaned_data: pd.DataFrame) -> pd.DataFrame:
+    """Generate multiple features."""
+    return pd.DataFrame({
+        'feature_1': cleaned_data['a'] * 2,
+        'feature_2': cleaned_data['b'] ** 2,
+        'feature_3': cleaned_data['a'] + cleaned_data['b'],
+    })
+
+# Creates 3 nodes: feature_1, feature_2, feature_3 (each a Series)
+```
+
+**Data Quality Validation**
+
+```python
+from hamilton.function_modifiers import check_output
+import pandera as pa
+
+@check_output(
+    data_type=float,
+    range=(0, 100),
+    importance="fail"  # or "warn"
+)
+def revenue_percentage(revenue: float, total: float) -> float:
+    """Calculate revenue as percentage."""
+    return (revenue / total) * 100
+
+# With Pandera schemas
+@check_output(
+    schema=pa.SeriesSchema(float, pa.Check.greater_than(0)),
+    importance="fail"
+)
+def positive_values(data: pd.Series) -> pd.Series:
+    """Ensure all values are positive."""
+    return data
+```
+
+**I/O Materialization**
+
+```python
+from hamilton.function_modifiers import save_to, load_from, value
+
[email protected](path=value("output.csv"), output_name_="saved_results")
+def final_results(aggregated_data: pd.DataFrame) -> pd.DataFrame:
+    """Save final results to CSV."""
+    return aggregated_data
+
[email protected](path=value("data.parquet"))
+def input_data(data: pd.DataFrame) -> pd.DataFrame:
+    """Load from parquet; the loaded frame is injected as the `data` parameter."""
+    return data
+```
+
+**Transformation Macros**
+
+```python
+from hamilton.function_modifiers import pipe, step
+
+def _drop_nulls(df: pd.DataFrame) -> pd.DataFrame:
+    return df.dropna()
+
+def _keep_positive(df: pd.DataFrame) -> pd.DataFrame:
+    return df[df['value'] > 0]
+
+@pipe(
+    step(_drop_nulls),
+    step(_keep_positive),
+)
+def cleaned_pipeline(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """The steps are applied to raw_data in order before this body runs."""
+    return raw_data.reset_index(drop=True)
+```
+
+### 3. Converting Existing Code to Hamilton
+
+**Before (Script):**
+```python
+import pandas as pd
+
+df = pd.read_csv('data.csv')
+df = df.dropna()
+df['feature'] = df['col_a'] * 2
+result = df.groupby('category')['feature'].mean()
+print(result)
+```
+
+**After (Hamilton Module):**
+```python
+"""Data processing DAG."""
+import pandas as pd
+
+def raw_data(data_path: str) -> pd.DataFrame:
+    """Load raw data."""
+    return pd.read_csv(data_path)
+
+def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Remove nulls."""
+    return raw_data.dropna()
+
+def feature(cleaned_data: pd.DataFrame) -> pd.Series:
+    """Calculate feature."""
+    return cleaned_data['col_a'] * 2
+
+def data_with_feature(cleaned_data: pd.DataFrame, feature: pd.Series) -> pd.DataFrame:
+    """Add feature to dataset."""
+    df = cleaned_data.copy()
+    df['feature'] = feature
+    return df
+
+def result(data_with_feature: pd.DataFrame) -> pd.Series:
+    """Aggregate by category."""
+    return data_with_feature.groupby('category')['feature'].mean()
+```
+
+**Conversion Guidelines:**
+1. Identify distinct computation steps
+2. Extract each step into a pure function
+3. Use previous step's variable name as function parameter
+4. Add type hints and docstrings
+5. Remove imperative variable assignments
+6. Test each function independently
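Following these guidelines, each converted function stays independently checkable with a tiny inline frame (pandas assumed; the functions below repeat the converted module above):

```python
import pandas as pd

def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
    """Remove nulls."""
    return raw_data.dropna()

def feature(cleaned_data: pd.DataFrame) -> pd.Series:
    """Calculate feature."""
    return cleaned_data['col_a'] * 2

# No Driver needed: call the functions directly with a small fixture
raw = pd.DataFrame({'col_a': [1.0, None, 3.0], 'category': ['x', 'x', 'y']})
print(feature(cleaned_data(raw)).tolist())  # [2.0, 6.0]
```
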
+
+### 4. Visualizing & Understanding DAGs
+
+**Generate Visualization:**
+```python
+from hamilton import driver
+import my_functions
+
+dr = driver.Driver({}, my_functions)
+
+# Create visualization
+dr.display_all_functions('dag.png')  # All nodes
+dr.visualize_execution(
+    ['final_output'],
+    'execution.png',
+    inputs={'input_data': ...}
+)  # Execution path only
+```
+
+**Understanding DAG Structure:**
+- Each function becomes a node
+- Function parameters create directed edges
+- No cycles allowed (DAG = Directed Acyclic Graph)
+- Execution order determined by dependencies
+- Multiple paths execute in parallel when possible
+
+**Debugging Tips:**
+1. Check for circular dependencies (A depends on B depends on A)
+2. Verify all parameter names match existing function names
+3. Look for typos in parameter names
+4. Use `dr.list_available_variables()` to see all nodes
+5. Check `dr.what_is_downstream_of('node_name')` for dependencies
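Tip 1 (cycle checking) can be sketched without Hamilton: extract each function's parameter names into an adjacency map and run a depth-first search. The `graph` dicts below are hypothetical dependency maps:

```python
def has_cycle(graph: dict) -> bool:
    """DFS with a 'currently visiting' set to detect back-edges."""
    visiting, done = set(), set()

    def visit(node: str) -> bool:
        if node in visiting:
            return True  # back-edge: we're still inside this node's subtree
        if node in done:
            return False
        visiting.add(node)
        if any(visit(dep) for dep in graph.get(node, ())):
            return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(visit(n) for n in graph)

print(has_cycle({'a': ['b'], 'b': ['a']}))           # True
print(has_cycle({'a': ['b'], 'b': ['c'], 'c': []}))  # False
```

Hamilton performs an equivalent check when building the graph; this sketch just shows why `a -> b -> a` is rejected.
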
+
+### 5. Testing Hamilton Functions
+
+**Unit Testing Pattern:**
+```python
+import pytest
+import pandas as pd
+from my_functions import cleaned_data, feature
+
+def test_cleaned_data():
+    """Test data cleaning."""
+    raw = pd.DataFrame({
+        'col_a': [1, 2, None, 4],
+        'col_b': ['a', 'b', 'c', 'd']
+    })
+    result = cleaned_data(raw)
+    assert len(result) == 3
+    assert result['col_a'].isna().sum() == 0
+
+def test_feature():
+    """Test feature calculation."""
+    data = pd.DataFrame({'col_a': [1, 2, 3]})
+    result = feature(data)
+    pd.testing.assert_series_equal(
+        result,
+        pd.Series([2, 4, 6], name='col_a')
+    )
+```
+
+**Integration Testing with Driver:**
+```python
+def test_full_pipeline():
+    """Test complete DAG execution."""
+    from hamilton import driver
+    import my_functions
+
+    dr = driver.Driver({}, my_functions)
+    result = dr.execute(
+        ['result'],
+        inputs={'data_path': 'test_data.csv'}
+    )
+    assert 'result' in result
+    assert result['result'].sum() > 0
+```
+
+### 6. Common Patterns & Examples
+
+**Feature Engineering:**
+```python
+from hamilton.function_modifiers import extract_columns
+
+@extract_columns('mean_price', 'max_price', 'min_price', 'std_price')
+def price_features(prices: pd.Series) -> pd.DataFrame:
+    """Calculate price statistics."""
+    return pd.DataFrame({
+        'mean_price': [prices.mean()],
+        'max_price': [prices.max()],
+        'min_price': [prices.min()],
+        'std_price': [prices.std()],
+    })
+```
+
+**LLM/RAG Workflows:**
+```python
+from typing import List
+
+def chunked_documents(raw_text: str, chunk_size: int = 500) -> List[str]:
+    """Split documents into chunks."""
+    words = raw_text.split()
+    return [
+        ' '.join(words[i:i+chunk_size])
+        for i in range(0, len(words), chunk_size)
+    ]
+
+def embeddings(chunked_documents: List[str], model: str = 'text-embedding-ada-002') -> List[List[float]]:
+    """Generate embeddings for chunks."""
+    import openai
+    response = openai.Embedding.create(input=chunked_documents, model=model)
+    return [item['embedding'] for item in response['data']]
+```
+
+**Data Quality Pipeline:**
+```python
+from hamilton.function_modifiers import check_output
+import pandera as pa
+
+schema = pa.DataFrameSchema({
+    'user_id': pa.Column(int, pa.Check.greater_than(0)),
+    'amount': pa.Column(float, pa.Check.in_range(0, 10000)),
+    'date': pa.Column(pa.DateTime),
+})
+
+@check_output(schema=schema, importance="fail")
+def validated_transactions(raw_transactions: pd.DataFrame) -> pd.DataFrame:
+    """Validate transaction data."""
+    return raw_transactions
+```
+
+## Key Files & Locations
+
+- **Core library**: `hamilton/` - Main package code
+- **Driver**: `hamilton/driver.py` - Main orchestration class
+- **Function modifiers**: `hamilton/function_modifiers/` - Decorators
+- **Plugins**: `hamilton/plugins/` - 50+ integrations
+- **Examples**: `examples/` - 60+ production examples
+- **Tests**: `tests/` - Unit and integration tests
+- **Docs**: `docs/` - Official documentation
+
+## Common Pitfalls & Solutions
+
+**Circular Dependencies:**
+```python
+# ❌ Bad - circular dependency
+def a(b: int) -> int:
+    return b + 1
+
+def b(a: int) -> int:
+    return a + 1
+
+# ✅ Good - break the cycle
+def a(input_value: int) -> int:
+    return input_value + 1
+
+def b(a: int) -> int:
+    return a + 1
+```
+
+**Missing Type Hints:**
+```python
+# ❌ Bad - no type hints
+def process(data):
+    return data * 2
+
+# ✅ Good - clear types
+def process(data: pd.Series) -> pd.Series:
+    return data * 2
+```
+
+**Mutating Inputs:**
+```python
+# ❌ Bad - mutates input
+def add_column(df: pd.DataFrame, col_name: str) -> pd.DataFrame:
+    df[col_name] = 0  # Modifies original!
+    return df
+
+# ✅ Good - returns new object
+def add_column(df: pd.DataFrame, col_name: str) -> pd.DataFrame:
+    result = df.copy()
+    result[col_name] = 0
+    return result
+```
+
+## Getting Help
+
+- **Documentation**: `docs/` directory in repo
+- **Examples**: `examples/` directory for patterns
+- **Community**: Hamilton Slack, GitHub issues
+- **Blog**: blog.dagworks.io for deep dives
+
+## Additional Resources
+
+For detailed reference material, see:
+- [examples.md](examples.md) - Curated example patterns
+- Hamilton official docs at docs.hamilton.dagworks.io
+- Hamilton GitHub repository examples folder
diff --git a/.claude/skills/hamilton/examples.md b/.claude/skills/hamilton/examples.md
new file mode 100644
index 00000000..828729d1
--- /dev/null
+++ b/.claude/skills/hamilton/examples.md
@@ -0,0 +1,610 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hamilton Code Examples & Patterns
+
+This document provides comprehensive examples for common Hamilton patterns.
+
+## Table of Contents
+
+1. [Basic DAG Creation](#basic-dag-creation)
+2. [Advanced Function Modifiers](#advanced-function-modifiers)
+3. [LLM & RAG Workflows](#llm--rag-workflows)
+4. [Feature Engineering](#feature-engineering)
+5. [Data Quality & Validation](#data-quality--validation)
+6. [Parallel Execution](#parallel-execution)
+7. [Caching Strategies](#caching-strategies)
+8. [Integration Patterns](#integration-patterns)
+
+---
+
+## Basic DAG Creation
+
+### Hello World Example
+
+**File: `hello_world_functions.py`**
+```python
+"""Simple Hamilton DAG for data processing."""
+import pandas as pd
+
+def raw_data(csv_path: str) -> pd.DataFrame:
+    """Load data from CSV file.
+
+    :param csv_path: Path to CSV file (external input)
+    """
+    return pd.read_csv(csv_path)
+
+def avg_age(raw_data: pd.DataFrame) -> float:
+    """Calculate average age.
+
+    :param raw_data: Raw data DataFrame
+    """
+    return raw_data['age'].mean()
+
+def filtered_data(raw_data: pd.DataFrame, min_age: int = 18) -> pd.DataFrame:
+    """Filter data by minimum age.
+
+    :param raw_data: Raw data DataFrame
+    :param min_age: Minimum age threshold (config/input)
+    """
+    return raw_data[raw_data['age'] >= min_age]
+```
+
+**File: `run.py`**
+```python
+from hamilton import driver
+import hello_world_functions
+
+# Create driver
+dr = driver.Driver(
+    {'min_age': 21},       # Config values (first positional argument)
+    hello_world_functions  # Modules are passed positionally, not as a keyword
+)
+
+# Execute DAG
+results = dr.execute(
+    final_vars=['avg_age', 'filtered_data'],
+    inputs={'csv_path': 'data.csv'}  # External inputs
+)
+
+print(f"Average age: {results['avg_age']}")
+print(f"Filtered rows: {len(results['filtered_data'])}")
+```
+
+---
+
+## Advanced Function Modifiers
+
+### Parameterization Patterns
+
+**Multiple Time Windows:**
+```python
+from hamilton.function_modifiers import parameterize, value
+
+@parameterize(
+    daily_sum={'window': value('1D'), 'agg': value('sum')},
+    weekly_avg={'window': value('7D'), 'agg': value('mean')},
+    monthly_max={'window': value('30D'), 'agg': value('max')},
+)
+def time_aggregation(
+    sales: pd.Series,
+    window: str,
+    agg: str
+) -> pd.Series:
+    """Aggregate sales over different time windows.
+
+    Creates nodes: daily_sum, weekly_avg, monthly_max
+    """
+    return sales.rolling(window).agg(agg)
+```
+
+**Multiple Data Sources:**
+```python
+from hamilton.function_modifiers import parameterize_sources
+
+@parameterize_sources(
+    revenue_per_user=dict(metric='revenue', denominator='users'),
+    cost_per_signup=dict(metric='cost', denominator='signups'),
+    profit_per_order=dict(metric='profit', denominator='orders'),
+)
+def rate_calculation(metric: pd.Series, denominator: pd.Series) -> pd.Series:
+    """Calculate various rate metrics."""
+    return metric / denominator
+```
+
+### Configuration-Based Polymorphism
+
+**Model Selection:**
+```python
+from hamilton.function_modifiers import config
+from sklearn.linear_model import LinearRegression, Ridge, Lasso
+
[email protected](model='linear')
+def trained_model__linear(X_train: pd.DataFrame, y_train: pd.Series) -> LinearRegression:
+    """Train linear regression model."""
+    model = LinearRegression()
+    model.fit(X_train, y_train)
+    return model
+
[email protected](model='ridge')
+def trained_model__ridge(X_train: pd.DataFrame, y_train: pd.Series) -> Ridge:
+    """Train ridge regression model."""
+    model = Ridge(alpha=1.0)
+    model.fit(X_train, y_train)
+    return model
+
[email protected](model='lasso')
+def trained_model__lasso(X_train: pd.DataFrame, y_train: pd.Series) -> Lasso:
+    """Train lasso regression model."""
+    model = Lasso(alpha=0.1)
+    model.fit(X_train, y_train)
+    return model
+
+# All create same node: trained_model
+# Selected by config: driver.Driver({'model': 'ridge'}, module)
+```
+
+**Environment-Specific Behavior:**
+```python
[email protected](environment='development')
+def data_source__dev() -> str:
+    """Development database connection."""
+    return "sqlite:///dev.db"
+
[email protected](environment='production')
+def data_source__prod() -> str:
+    """Production database connection."""
+    return "postgresql://prod-server/db"
+```
+
+### Column Extraction
+
+**Feature Generation:**
+```python
+from hamilton.function_modifiers import extract_columns
+
+@extract_columns('price_mean', 'price_median', 'price_std', 'price_min', 'price_max')
+def price_statistics(transactions: pd.DataFrame) -> pd.DataFrame:
+    """Generate price statistics.
+
+    Creates 5 separate nodes, each containing a single series.
+    """
+    prices = transactions['price']
+    return pd.DataFrame({
+        'price_mean': [prices.mean()],
+        'price_median': [prices.median()],
+        'price_std': [prices.std()],
+        'price_min': [prices.min()],
+        'price_max': [prices.max()],
+    })
+```
+
+---
+
+## LLM & RAG Workflows
+
+### Document Processing Pipeline
+
+```python
+from typing import List, Dict
+import openai
+from hamilton.function_modifiers import parameterize, extract_columns
+
+def document_text(pdf_path: str) -> str:
+    """Extract text from PDF."""
+    import PyPDF2
+    with open(pdf_path, 'rb') as f:
+        reader = PyPDF2.PdfReader(f)
+        return '\n'.join(page.extract_text() for page in reader.pages)
+
+def chunks(document_text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
+    """Split document into overlapping chunks."""
+    chunks = []
+    start = 0
+    while start < len(document_text):
+        end = start + chunk_size
+        chunks.append(document_text[start:end])
+        start = end - overlap
+    return chunks
+
+def embeddings(chunks: List[str], embedding_model: str = 'text-embedding-ada-002') -> List[List[float]]:
+    """Generate embeddings for chunks."""
+    response = openai.Embedding.create(
+        input=chunks,
+        model=embedding_model
+    )
+    return [item['embedding'] for item in response['data']]
+
+def vector_store(chunks: List[str], embeddings: List[List[float]]) -> Dict:
+    """Store chunks with embeddings in vector database."""
+    import pinecone
+    index = pinecone.Index('documents')
+
+    vectors = [
+        (f"chunk_{i}", emb, {"text": chunk})
+        for i, (chunk, emb) in enumerate(zip(chunks, embeddings))
+    ]
+    index.upsert(vectors)
+    return {"status": "success", "count": len(vectors)}
+
+def query_results(
+    query: str,
+    embedding_model: str,
+    vector_store: Dict,
+    top_k: int = 5
+) -> List[str]:
+    """Retrieve relevant chunks for query."""
+    import pinecone
+
+    # Generate query embedding
+    query_emb = openai.Embedding.create(
+        input=[query],
+        model=embedding_model
+    )['data'][0]['embedding']
+
+    # Search vector store
+    index = pinecone.Index('documents')
+    results = index.query(query_emb, top_k=top_k, include_metadata=True)
+
+    return [match['metadata']['text'] for match in results['matches']]
+
+def llm_response(query: str, query_results: List[str]) -> str:
+    """Generate response using retrieved context."""
+    context = "\n\n".join(query_results)
+    prompt = f"""Answer the question based on the context below.
+
+Context:
+{context}
+
+Question: {query}
+
+Answer:"""
+
+    response = openai.ChatCompletion.create(
+        model="gpt-4",
+        messages=[{"role": "user", "content": prompt}]
+    )
+    return response.choices[0].message.content
+```
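The overlap arithmetic in `chunks` above is easy to verify standalone (same logic, no I/O; small sizes chosen for readability):

```python
def make_chunks(text: str, chunk_size: int, overlap: int) -> list:
    """Character-based chunking with a fixed overlap between chunks."""
    out, start = [], 0
    while start < len(text):
        out.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # same as: start = end - overlap
    return out

parts = make_chunks("abcdefghijklmnopqrst", chunk_size=10, overlap=3)
print(parts)  # ['abcdefghij', 'hijklmnopq', 'opqrst']
```

Each chunk repeats the last `overlap` characters of its predecessor, which is what keeps sentence fragments from being split across retrieval units.
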
+
+### Multi-Provider Vector Database Pattern
+
+```python
+from hamilton.function_modifiers import config
+
[email protected](vector_db='pinecone')
+def vector_store__pinecone(embeddings: List[List[float]], chunks: List[str]) -> object:
+    """Store in Pinecone."""
+    import pinecone
+    index = pinecone.Index('docs')
+    index.upsert([(f"c{i}", e, {"text": c}) for i, (e, c) in enumerate(zip(embeddings, chunks))])
+    return index
+
[email protected](vector_db='weaviate')
+def vector_store__weaviate(embeddings: List[List[float]], chunks: List[str]) -> object:
+    """Store in Weaviate."""
+    import weaviate
+    client = weaviate.Client("http://localhost:8080")
+    # Weaviate-specific logic
+    return client
+
[email protected](vector_db='lancedb')
+def vector_store__lancedb(embeddings: List[List[float]], chunks: List[str]) -> object:
+    """Store in LanceDB."""
+    import lancedb
+    db = lancedb.connect("./lancedb")
+    # LanceDB-specific logic
+    return db
+```
+
+---
+
+## Feature Engineering
+
+### Time Series Features
+
+```python
+import pandas as pd
+from typing import Callable, Optional
+
+from hamilton.function_modifiers import parameterize, value
+
+def timestamps(data: pd.DataFrame) -> pd.Series:
+    """Convert to datetime."""
+    return pd.to_datetime(data['timestamp'])
+
+@parameterize(
+    hour_of_day={'freq': value('hour')},
+    day_of_week={'freq': value('dayofweek')},
+    month_of_year={'freq': value('month')},
+    is_weekend={'freq': value('dayofweek'), 'transform': value(lambda x: x >= 5)},
+)
+def time_features(timestamps: pd.Series, freq: str, transform: Optional[Callable] = None) -> pd.Series:
+    """Extract time-based features."""
+    result = getattr(timestamps.dt, freq)
+    return transform(result) if transform is not None else result
+
+def rolling_mean_7d(values: pd.Series) -> pd.Series:
+    """7-day rolling average."""
+    return values.rolling(window=7, min_periods=1).mean()
+
+def rolling_std_7d(values: pd.Series) -> pd.Series:
+    """7-day rolling standard deviation."""
+    return values.rolling(window=7, min_periods=1).std()
+
+def lag_1d(values: pd.Series) -> pd.Series:
+    """Previous day's value."""
+    return values.shift(1)
+
+def diff_1d(values: pd.Series) -> pd.Series:
+    """Day-over-day difference."""
+    return values.diff(1)
+```
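The lag and difference nodes above are thin wrappers over pandas; a quick check of what the underlying calls produce:

```python
import pandas as pd

values = pd.Series([10.0, 12.0, 15.0, 11.0])

lag_1d = values.shift(1)   # previous day's value; first entry becomes NaN
diff_1d = values.diff(1)   # day-over-day change; first entry becomes NaN

print(lag_1d.tolist()[1:])   # [10.0, 12.0, 15.0]
print(diff_1d.tolist()[1:])  # [2.0, 3.0, -4.0]
```

Because each is its own node, downstream features can depend on `lag_1d` or `diff_1d` independently without recomputing the other.
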
+
+### Categorical Feature Engineering
+
+```python
+from sklearn.preprocessing import LabelEncoder, OneHotEncoder
+import pandas as pd
+
+def label_encoded_category(category_column: pd.Series) -> pd.Series:
+    """Label encode categorical variable."""
+    encoder = LabelEncoder()
+    return pd.Series(encoder.fit_transform(category_column))
+
+def one_hot_encoded_categories(category_column: pd.Series) -> pd.DataFrame:
+    """One-hot encode categorical variable."""
+    return pd.get_dummies(category_column, prefix='cat')
+
+def frequency_encoded_category(category_column: pd.Series) -> pd.Series:
+    """Frequency encoding for high-cardinality categories."""
+    freq_map = category_column.value_counts(normalize=True).to_dict()
+    return category_column.map(freq_map)
+```
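Frequency encoding is the least familiar of the three encoders; here is what it produces on a small series:

```python
import pandas as pd

cats = pd.Series(['a', 'a', 'b', 'a'])
freq_map = cats.value_counts(normalize=True).to_dict()  # {'a': 0.75, 'b': 0.25}
encoded = cats.map(freq_map)

print(encoded.tolist())  # [0.75, 0.75, 0.25, 0.75]
```

Each category is replaced by its share of the data, which keeps high-cardinality columns to a single numeric feature instead of one column per category.
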
+
+---
+
+## Data Quality & Validation
+
+### Schema Validation with Pandera
+
+```python
+from hamilton.function_modifiers import check_output
+import pandera as pa
+
+# Define schema
+transaction_schema = pa.DataFrameSchema({
+    'transaction_id': pa.Column(str, pa.Check.str_length(min_value=10, max_value=20)),
+    'user_id': pa.Column(int, pa.Check.greater_than(0)),
+    'amount': pa.Column(float, pa.Check.in_range(0, 100000)),
+    'timestamp': pa.Column(pa.DateTime),
+    'status': pa.Column(str, pa.Check.isin(['pending', 'completed', 'failed'])),
+})
+
+@check_output(schema=transaction_schema, importance="fail")
+def validated_transactions(raw_transactions: pd.DataFrame) -> pd.DataFrame:
+    """Validate transaction data schema."""
+    return raw_transactions
+
+# Custom validators
+@check_output(
+    data_type=float,
+    range=(0, 1),
+    allow_nan=False,
+    importance="fail"
+)
+def probability_score(model_output: pd.Series) -> pd.Series:
+    """Ensure scores are valid probabilities."""
+    return model_output
+```
+
+### Custom Data Quality Checks
+
+```python
+from hamilton.function_modifiers import check_output_custom
+
+def no_duplicates_check(df: pd.DataFrame) -> bool:
+    """Check for duplicate rows."""
+    return not df.duplicated().any()
+
+def required_columns_check(df: pd.DataFrame, required: list) -> bool:
+    """Check all required columns exist."""
+    return all(col in df.columns for col in required)
+
+@check_output_custom(
+    no_duplicates_check,
+    lambda df: required_columns_check(df, ['id', 'name', 'value']),
+    importance="warn"
+)
+def cleaned_data(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Clean data with quality checks."""
+    return raw_data.drop_duplicates()
+```
+
+---
+
+## Parallel Execution
+
+### Multithreading Executor Pattern
+
+```python
+from hamilton import driver
+from hamilton.execution import executors
+import my_functions
+
+# Create driver with parallel executor
+dr = driver.Builder()\
+    .with_modules(my_functions)\
+    .enable_dynamic_execution(allow_experimental_mode=True)\
+    .with_local_executor(executors.MultiThreadingExecutor(max_tasks=4))\
+    .build()
+
+results = dr.execute(['output1', 'output2'])
+```
+
+### Ray Distributed Execution
+
+```python
+from hamilton import base, driver
+from hamilton.plugins import h_ray
+import ray
+
+ray.init()
+
+# Use the Ray adapter for distributed processing
+ray_adapter = h_ray.RayGraphAdapter(result_builder=base.DictResult())
+
+dr = driver.Driver(
+    {},
+    my_functions,
+    adapter=ray_adapter
+)
+
+results = dr.execute(['large_computation'], inputs={...})
+```
+
+---
+
+## Caching Strategies
+
+### Function-Level Caching
+
+```python
+from hamilton import driver
+from hamilton.function_modifiers import cache
+
+@cache()  # opt this node into fine-grained caching behavior
+def expensive_computation(large_dataset: pd.DataFrame) -> pd.DataFrame:
+    """Cache results to disk."""
+    # Expensive operation
+    return large_dataset.apply(complex_transformation)
+
+# Caching is enabled when building the driver:
+# dr = driver.Builder().with_modules(my_module).with_cache(path="./cache").build()
+```
+
+### Disk-Based Caching via Materialization
+
+```python
+from hamilton.function_modifiers import save_to, load_from, value
+
[email protected](path=value("intermediate_results.parquet"), output_name_="saved_intermediate")
+def intermediate_step(raw_data: pd.DataFrame) -> pd.DataFrame:
+    """Save intermediate results."""
+    return raw_data.groupby('category').sum()
+
+# Later in pipeline
[email protected](path=value("intermediate_results.parquet"))
+def cached_intermediate(data: pd.DataFrame) -> pd.DataFrame:
+    """Load cached intermediate results; the frame is injected as `data`."""
+    return data
+```
+
+---
+
+## Integration Patterns
+
+### Airflow Integration
+
+```python
+from datetime import datetime
+
+from airflow import DAG
+from airflow.operators.python import PythonOperator
+from hamilton import driver
+import my_hamilton_functions
+
+def run_hamilton_dag(**context):
+    """Execute Hamilton DAG within Airflow task."""
+    dr = driver.Driver({}, my_hamilton_functions)
+    results = dr.execute(['final_output'], inputs=context['params'])
+    return results
+
+with DAG('hamilton_pipeline', schedule_interval='@daily', start_date=datetime(2024, 1, 1)) as dag:
+    task = PythonOperator(
+        task_id='run_hamilton',
+        python_callable=run_hamilton_dag,
+        params={'data_date': '{{ ds }}'}
+    )
+```
+
+### FastAPI Integration
+
+```python
+from fastapi import FastAPI
+from hamilton import driver
+import prediction_functions
+
+app = FastAPI()
+dr = driver.Driver({}, prediction_functions)
+
[email protected]("/predict")
+def predict(features: dict):
+    """Stateless prediction endpoint."""
+    result = dr.execute(['prediction'], inputs=features)
+    return {"prediction": result['prediction']}
+```
+
+### Streamlit Dashboard
+
+```python
+import streamlit as st
+from hamilton import driver
+import dashboard_functions
+
+st.title("Data Dashboard")
+
+# User inputs
+date_range = st.date_input("Select date range", [])
+metric = st.selectbox("Metric", ["revenue", "users", "conversions"])
+
+# Execute Hamilton DAG
+dr = driver.Driver({'metric': metric}, dashboard_functions)
+results = dr.execute(['visualization_data'], inputs={'date_range': date_range})
+
+# Display results
+st.line_chart(results['visualization_data'])
+```
+
+---
+
+## Advanced Patterns
+
+### Subdag Pattern
+
+```python
+from hamilton.function_modifiers import source, subdag
+import feature_engineering  # module whose functions form the subdag
+
+@subdag(
+    feature_engineering,
+    inputs={'input_data': source('input_data')},
+    config={'scaler': 'standard'},
+)
+def processed_features(scaled_features: pd.DataFrame) -> pd.DataFrame:
+    """Expose the subdag's 'scaled_features' output as 'processed_features'."""
+    return scaled_features
+```
+
+### Dynamic Execution
+
+```python
+from hamilton import driver
+from hamilton.execution import executors
+import dynamic_functions  # defines Parallelizable[...] / Collect[...] nodes
+
+dr = (
+    driver.Builder()
+    .with_modules(dynamic_functions)
+    .enable_dynamic_execution(allow_experimental_mode=True)
+    .with_remote_executor(executors.MultiThreadingExecutor(max_tasks=4))
+    .build()
+)
+
+# A Parallelizable node fans out over param_list at runtime;
+# a Collect node gathers the per-element results.
+results = dr.execute(
+    ['final_output'],
+    inputs={'param_list': ['a', 'b', 'c']},
+)
+```
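Conceptually, a fan-out over a runtime list followed by a fan-in over the results is all that dynamic execution does. A plain-Python sketch of that shape (this is not the Hamilton API; the function names here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def expand(param_list):
    """Fan-out: one unit of work per runtime element."""
    return list(param_list)

def process_one(param: str) -> str:
    """Work applied to each expanded element."""
    return param.upper()

def collect(results):
    """Fan-in: gather and combine per-element results."""
    return sorted(results)

def run(param_list):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return collect(list(pool.map(process_one, expand(param_list))))
```

In Hamilton, `expand` corresponds to a node returning `Parallelizable[...]` and `collect` to one receiving `Collect[...]`; the executor decides where each `process_one`-style task runs.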
+
+This covers the most common Hamilton patterns. Refer to the main SKILL.md for additional guidance and the `examples/` directory in the Hamilton repository for more production examples.
diff --git a/docs/ecosystem/claude-code-plugin.md b/docs/ecosystem/claude-code-plugin.md
new file mode 100644
index 00000000..91f60435
--- /dev/null
+++ b/docs/ecosystem/claude-code-plugin.md
@@ -0,0 +1,406 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Claude Code Plugin for Hamilton
+
+The Hamilton Claude Code plugin provides AI-powered assistance for developing Hamilton DAGs using [Claude Code](https://code.claude.com), Anthropic's official CLI tool.
+
+## What is Claude Code?
+
+[Claude Code](https://code.claude.com) is an AI-powered CLI tool that helps you build software faster by providing intelligent code assistance, debugging help, and code generation capabilities. The Hamilton plugin extends Claude Code with deep knowledge of Hamilton patterns and best practices.
+
+## Features
+
+The Hamilton Claude Code plugin provides expert assistance for:
+
+- 🏗️ **Creating new Hamilton modules** with proper patterns and decorators
+- 🔍 **Understanding existing DAGs** by explaining dataflow and dependencies
+- 🎨 **Applying function modifiers** correctly (@parameterize, @config.when, @check_output, etc.)
+- 🐛 **Debugging issues** in DAG definitions and execution
+- 🔄 **Converting Python scripts** to Hamilton modules
+- ⚡ **Optimizing pipelines** with caching, parallelization, and best practices
+- ✅ **Writing tests** for Hamilton functions
+- 📊 **Generating visualizations** of your DAGs
+
+## Installation
+
+### Prerequisites
+
+First, install Claude Code:
+
+```bash
+# Install Claude Code CLI
+curl -fsSL https://cli.claude.ai/install.sh | sh
+```
+
+For more installation options, see the [Claude Code documentation](https://code.claude.com/docs/en/install.html).
+
+### Install the Hamilton Plugin
+
+Once Claude Code is installed, you can add the Hamilton plugin:
+
+```bash
+# Add the Hamilton plugin marketplace
+/plugin marketplace add dagworks-inc/hamilton
+
+# Install the plugin (available across all projects)
+/plugin install hamilton --scope user
+```
+
+Or combine into a single command:
+```bash
+claude plugin install hamilton@dagworks-inc/hamilton --scope user
+```
+
+**Installation scopes:**
+- `--scope user` - Available in all your projects (recommended)
+- `--scope project` - Only in the current project
+- `--scope local` - Testing/development only
+
+### For Contributors
+
+If you've cloned the Hamilton repository, the skill is already available in `.claude/skills/hamilton/` and will be automatically discovered by Claude Code when you work in the repo. No installation needed!
+
+## Usage
+
+### Automatic Assistance
+
+Claude Code will automatically use the Hamilton plugin when it detects you're working with Hamilton code. Just ask questions or give instructions naturally:
+
+```bash
+claude "Help me create a Hamilton module for processing customer data"
+claude "Explain what this DAG does"
+claude "Convert this pandas script to Hamilton"
+claude "Add caching to my expensive computation function"
+claude "Why am I getting a circular dependency error?"
+```
+
+### Manual Invocation
+
+You can explicitly invoke the skill using the `/hamilton` command:
+
+```bash
+claude /hamilton create a feature engineering module with rolling averages
+claude /hamilton explain the dataflow in my_functions.py
+claude /hamilton optimize this DAG for parallel execution
+```
+
+### Interactive Mode
+
+For ongoing assistance, start an interactive session:
+
+```bash
+# Start Claude Code in your Hamilton project
+cd /path/to/your/hamilton/project
+claude
+```
+
+Then interact naturally:
+```
+You: I need to create a DAG that processes customer transactions
+Claude: I'll help you create a Hamilton module for processing customer transactions...
+
+You: Add validation to ensure all amounts are positive
+Claude: I'll add a @check_output decorator with validation...
+```
+
+## Examples
+
+### Creating a New Hamilton Module
+
+**Prompt:**
+```
+Create a Hamilton module that loads sales data from a CSV, removes nulls,
+calculates a 7-day rolling average, and outputs the top 10 days by sales.
+```
+
+**Claude generates:**
+```python
+"""Sales analysis DAG."""
+import pandas as pd
+
+def raw_sales(csv_path: str) -> pd.DataFrame:
+    """Load sales data from CSV.
+
+    :param csv_path: Path to sales CSV file
+    :return: Raw sales DataFrame
+    """
+    return pd.read_csv(csv_path)
+
+def cleaned_sales(raw_sales: pd.DataFrame) -> pd.DataFrame:
+    """Remove null values from sales data.
+
+    :param raw_sales: Raw sales data
+    :return: Cleaned sales DataFrame
+    """
+    return raw_sales.dropna()
+
+def rolling_avg_7d(cleaned_sales: pd.DataFrame) -> pd.Series:
+    """Calculate 7-day rolling average of sales.
+
+    :param cleaned_sales: Cleaned sales data
+    :return: 7-day rolling average
+    """
+    return cleaned_sales['sales'].rolling(window=7, min_periods=1).mean()
+
+def top_10_sales_days(
+    cleaned_sales: pd.DataFrame,
+    rolling_avg_7d: pd.Series
+) -> pd.DataFrame:
+    """Get top 10 days by rolling average.
+
+    :param cleaned_sales: Cleaned sales data
+    :param rolling_avg_7d: 7-day rolling average
+    :return: Top 10 sales days
+    """
+    result = cleaned_sales.copy()
+    result['rolling_avg'] = rolling_avg_7d
+    return result.nlargest(10, 'rolling_avg')
+
+# Usage (from another file, assuming this module is saved as sales_analysis.py):
+# from hamilton import driver
+# import sales_analysis
+# dr = driver.Driver({}, sales_analysis)  # Driver takes module objects, not names
+# results = dr.execute(['top_10_sales_days'], inputs={'csv_path': 'sales.csv'})
+```
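For reference, `rolling(window=7, min_periods=1).mean()` is a trailing mean over up to seven values; the same logic in plain Python is handy for unit-testing the pandas version against small inputs:

```python
from collections import deque

def rolling_mean(values, window=7):
    """Trailing mean over up to `window` values (min_periods=1 semantics)."""
    buf = deque(maxlen=window)  # keeps only the last `window` values
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out
```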
+
+### Converting Existing Code
+
+**Prompt:**
+```
+Convert this script to Hamilton:
+
+import pandas as pd
+df = pd.read_csv('data.csv')
+df['feature'] = df['col_a'] * 2 + df['col_b']
+result = df.groupby('category')['feature'].mean()
+```
+
+**Claude refactors to:**
+```python
+"""Data processing DAG."""
+import pandas as pd
+
+def raw_data(data_path: str) -> pd.DataFrame:
+    """Load raw data from CSV."""
+    return pd.read_csv(data_path)
+
+def feature(raw_data: pd.DataFrame) -> pd.Series:
+    """Calculate feature from columns."""
+    return raw_data['col_a'] * 2 + raw_data['col_b']
+
+def data_with_feature(raw_data: pd.DataFrame, feature: pd.Series) -> pd.DataFrame:
+    """Add feature column to dataset."""
+    result = raw_data.copy()
+    result['feature'] = feature
+    return result
+
+def result(data_with_feature: pd.DataFrame) -> pd.Series:
+    """Calculate mean feature by category."""
+    return data_with_feature.groupby('category')['feature'].mean()
+```
+
+### Applying Decorators
+
+**Prompt:**
+```
+I need to create rolling averages for 7, 30, and 90 day windows.
+How do I do this without repeating code?
+```
+
+**Claude suggests:**
+```python
+from hamilton.function_modifiers import parameterize, value
+
+@parameterize(
+    rolling_7d={'window': value(7)},
+    rolling_30d={'window': value(30)},
+    rolling_90d={'window': value(90)},
+)
+def rolling_average(sales: pd.Series, window: int) -> pd.Series:
+    """Calculate rolling average for different windows.
+
+    Creates three nodes: rolling_7d, rolling_30d, rolling_90d
+    """
+    return sales.rolling(window=window, min_periods=1).mean()
+```
+
+### Debugging
+
+**Prompt:**
+```
+I'm getting: "Could not find parameter 'processed_data' in graph"
+```
+
+**Claude analyzes:**
+```
+This error occurs when a function parameter references a node that doesn't exist
+in your DAG. Common causes:
+
+1. Typo in parameter name (e.g., 'processed_data' vs 'process_data')
+2. Missing function definition for 'processed_data'
+3. Function excluded via @hamilton_exclude or @config.when
+
+Check your module for a function named 'processed_data'. If it doesn't exist,
+either:
+- Add the function, or
+- Fix the parameter name to match an existing function
+```
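The rule being applied above can be checked mechanically: every function parameter must name another function in the module or a runtime input. A rough stdlib sketch of that check (simplified; real Hamilton resolution also accounts for decorators, config, and inputs supplied at execution time):

```python
import inspect

def find_missing_dependencies(module, inputs=()):
    """Report parameters that match neither a function in the module
    nor a provided input name (simplified Hamilton-style resolution)."""
    functions = {
        name: fn
        for name, fn in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_")  # underscore-prefixed helpers are not nodes
    }
    known = set(functions) | set(inputs)
    missing = {}
    for name, fn in functions.items():
        unknown = [p for p in inspect.signature(fn).parameters if p not in known]
        if unknown:
            missing[name] = unknown
    return missing
```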
+
+## What the Plugin Knows
+
+The plugin has deep knowledge of Hamilton:
+
+### Core Concepts
+- Function-based DAG definitions
+- Driver configuration and execution
+- Node dependencies via parameters
+- Type hints and annotations
+
+### Function Modifiers
+- **Configuration**: @config.when, @hamilton_exclude
+- **Parameterization**: @parameterize, @parameterize_sources, @parameterize_values
+- **Column extraction**: @extract_columns, @extract_fields
+- **Data quality**: @check_output, @check_output_custom, @schema
+- **I/O**: @save_to, @load_from, @dataloader, @datasaver
+- **Transformation**: @pipe, @does, @mutate, @step
+- **Advanced**: @subdag, @resolve, @cache
+
+### Integration Patterns
+- Airflow orchestration
+- FastAPI microservices
+- Streamlit dashboards
+- Jupyter notebooks
+- Ray/Dask/Spark distributed execution
+
+### LLM Workflows
+- RAG pipeline patterns
+- Document chunking
+- Vector database operations
+- Embedding generation
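For example, a document-chunking node in such a pipeline typically splits text into overlapping windows before embedding; a minimal sketch (the function name and defaults are illustrative, not a prescribed API):

```python
def text_chunks(document_text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    return [document_text[i:i + chunk_size] for i in range(0, len(document_text), step)]
```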
+
+### Best Practices
+- Testing strategies
+- Code organization
+- Error handling
+- Performance optimization
+
+## Plugin Structure
+
+The plugin is organized as follows:
+
+```
+.claude/plugins/hamilton/
+├── .claude-plugin/
+│   ├── plugin.json           # Plugin manifest
+│   └── marketplace.json      # Marketplace configuration
+├── skills/
+│   └── hamilton/
+│       ├── SKILL.md          # Main skill instructions
+│       ├── examples.md       # Code examples and patterns
+│       └── README.md         # Skill documentation
+└── README.md                 # Plugin documentation
+```
+
+For contributors, a copy exists in `.claude/skills/hamilton/` for immediate use.
+
+## Contributing
+
+Found a bug or want to improve the plugin? We'd love your help!
+
+### Report Issues
+
+Please [file an issue](https://github.com/apache/hamilton/issues/new) with:
+- Clear description of the problem
+- Steps to reproduce
+- Expected vs actual behavior
+- Hamilton and Claude Code versions
+
+### Submit Pull Requests
+
+1. Fork the repository: https://github.com/apache/hamilton
+2. Make changes in `.claude/skills/hamilton/` or `.claude/plugins/hamilton/`
+3. Test thoroughly with various scenarios
+4. Submit a PR with a clear description
+
+**Contribution ideas:**
+- 📚 Add new examples to `examples.md`
+- 📝 Improve instructions in `SKILL.md`
+- 🐛 Fix bugs or inaccuracies
+- ✨ Add support for new Hamilton features
+- 📖 Enhance documentation
+
+See [CONTRIBUTING.md](https://github.com/apache/hamilton/blob/main/CONTRIBUTING.md) for guidelines.
+
+## Requirements
+
+- **Claude Code CLI** - v0.1.0 or later
+- **Hamilton** - v1.0.0 or later
+- **Python** - 3.9 or later
+
+## Troubleshooting
+
+### Plugin Not Loading
+
+If the plugin isn't recognized:
+
+```bash
+# Check installed plugins
+claude plugin list
+
+# Reinstall if needed
+claude plugin uninstall hamilton
+claude plugin install hamilton@dagworks-inc/hamilton --scope user
+```
+
+### Skill Not Activating
+
+If Claude doesn't seem to use Hamilton knowledge:
+
+```bash
+# Manually invoke the skill
+claude /hamilton <your-question>
+
+# Or mention Hamilton explicitly in your prompt
+claude "Using Hamilton framework, create a DAG for..."
+```
+
+### Permission Errors
+
+The plugin requests permission to:
+- Read files (Read, Grep, Glob)
+- Run Python code (python, pytest)
+- Search files (find)
+
+If prompted, approve these permissions for the best experience.
+
+## Learn More
+
+- **Hamilton Documentation**: https://hamilton.dagworks.io
+- **Claude Code Documentation**: https://code.claude.com/docs
+- **Hamilton GitHub**: https://github.com/apache/hamilton
+- **Hamilton Examples**: https://github.com/apache/hamilton/tree/main/examples
+- **Community Slack**: Join via Hamilton GitHub repo
+
+## License
+
+This plugin is part of the Apache Hamilton project and is licensed under the Apache 2.0 License.
+
+---
+
+**Enhance your Hamilton development with AI! 🚀**
diff --git a/docs/ecosystem/index.md b/docs/ecosystem/index.md
index 4c4dd9e8..dfb601fb 100644
--- a/docs/ecosystem/index.md
+++ b/docs/ecosystem/index.md
@@ -118,6 +118,7 @@ Improve your development workflow:
 |------------|-------------|---------------|
 | <img src="../_static/logos/jupyter.png" width="20" height="20" style="vertical-align: middle;"> **Jupyter** | Notebook magic commands | [Examples](https://github.com/apache/hamilton/tree/main/examples/jupyter_notebook_magic) |
 | <img src="../_static/logos/vscode.png" width="20" height="20" style="vertical-align: middle;"> **VS Code** | Language server and extension | [VS Code Guide](../hamilton-vscode/index.rst) |
+| **Claude Code** | AI assistant plugin for Hamilton development | [Plugin Guide](claude-code-plugin.md) |
 | <img src="../_static/logos/tqdm.png" width="20" height="20" style="vertical-align: middle;"> **tqdm** | Progress bars | [Lifecycle Hook](../reference/lifecycle-hooks/ProgressBar.rst) |
 
 ### Cloud Providers & Infrastructure
